Rishabh1925/scene-localization-system
Powerful CLIP-based computer vision system for natural language-driven object and scene localization in images. Features smart query expansion, adaptive detection, and an interactive web UI.
This system helps professionals such as marketers, researchers, and archivists quickly find specific objects or scenes within a collection of images using everyday language. You supply an image and a text description (e.g., "red car"), and the system returns the image with bounding boxes around the detected items, a cropped image of each detection, and detailed metadata. It is designed for anyone who needs to analyze images visually without specialized technical training.
Use this if you need to precisely locate and identify objects or complex scenes within images using natural language queries, such as for content analysis, visual search, or automated tagging.
Not ideal if you need real-time object detection for live video streams, since analysis can take several minutes, or if you require extremely high precision for safety-critical applications.
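To make the input and output described above concrete, here is a minimal sketch of one common way to localize a text query in an image: score candidate windows against the query embedding, keep the windows above a threshold, and suppress overlapping boxes. This is not the repository's actual implementation. The names `sliding_windows`, `localize`, and `embed_window`, as well as the threshold values, are hypothetical; in a real CLIP pipeline `embed_window` would crop the region and run it through the image encoder, and `text_vec` would come from the text encoder.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


@dataclass
class Detection:
    box: Box
    score: float  # cosine similarity between the query and the window


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def sliding_windows(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Square candidate windows covering the image, row by row."""
    return [
        (x, y, x + size, y + size)
        for y in range(0, height - size + 1, stride)
        for x in range(0, width - size + 1, stride)
    ]


def localize(
    text_vec: Sequence[float],
    windows: Sequence[Box],
    embed_window: Callable[[Box], Sequence[float]],  # hypothetical image encoder
    threshold: float = 0.5,  # assumed value, not taken from the project
    max_iou: float = 0.3,    # assumed value, not taken from the project
) -> List[Detection]:
    """Score each window against the query, then apply greedy non-maximum suppression."""
    scored = [Detection(box, cosine(text_vec, embed_window(box))) for box in windows]
    candidates = sorted(
        (d for d in scored if d.score >= threshold),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: List[Detection] = []
    for det in candidates:
        # Keep a detection only if it does not overlap a better one too much.
        if all(iou(det.box, k.box) <= max_iou for k in kept):
            kept.append(det)
    return kept
```

Each returned `Detection` carries the box that would be drawn on the image and used for cropping, plus a score that could go into the per-detection metadata. Scoring many windows this way is slow, which is consistent with the note above that analysis is not suited to real-time use.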
Stars: 10
Forks: —
Language: HTML
License: MIT
Category:
Last pushed: Oct 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Rishabh1925/scene-localization-system"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
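The same request can be made from a script. The sketch below mirrors the `curl` call using only the Python standard library. It assumes that the endpoint returns JSON and that the path has the form category/owner/repository, as the example URL suggests; neither is confirmed here, and the function names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    # Assumption: the path layout is <category>/<owner>/<repo>.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in (category, owner, repo))


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 30.0):
    """Fetch the listing data; assumes the response body is JSON."""
    request = Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    data = fetch_quality("ml-frameworks", "Rishabh1925", "scene-localization-system")
    print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests per day, a script that polls many repositories should cache responses rather than refetch them on every run.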
Higher-rated alternatives
rom1504/img2dataset
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M...
devrimcavusoglu/pybboxes
Light weight toolkit for bounding boxes providing conversion between bounding box types and...
PyRetri/PyRetri
Open source deep learning based unsupervised image retrieval toolbox built on PyTorch🔥
Particle1904/DatasetHelpers
Dataset Helper program to automatically select, re scale and tag Datasets (composed of image and...
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence