xuyang-liu16/VGDiffZero

[ICASSP 2024] VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders

Quality score: 14 / 100 (Experimental)

This project helps pinpoint specific objects within an image based on a descriptive text phrase, without needing to train a custom model. You provide an image and a text query (e.g., "the red car"), and it outputs the precise location of that object in the image. This is useful for researchers and practitioners working with image analysis, computer vision, and visual search who need to accurately identify and localize visual elements described by text.

No commits in the last 6 months.

Use this if you need to precisely locate objects in images using text descriptions, without the hassle of fine-tuning or training a new model.

Not ideal if your primary goal is generating new images from text or if you don't need highly specific object localization.

image-analysis visual-search object-localization computer-vision content-tagging
No License · Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 8 / 25
Community: 0 / 25

How are scores calculated?

Stars: 17
Forks: (not listed)
Language: Python
License: None
Last pushed: Feb 11, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/xuyang-liu16/VGDiffZero"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
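If you prefer Python over curl, the endpoint above can be called with the standard library alone. This is a minimal sketch: the URL pattern comes from the curl example on this page, but the structure of the JSON response (and its field names) is an assumption, so the fetch helper simply returns the decoded document.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/diffusion"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-report URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality report (requires network access).

    The response is assumed to be a JSON object; field names are not
    documented on this page, so inspect the returned dict yourself.
    """
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Prints the URL for this repo; call fetch_quality(...) to hit the API.
    print(quality_url("xuyang-liu16", "VGDiffZero"))
```

Note the unauthenticated limit of 100 requests/day when polling this endpoint in a loop.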