vpulab/ovam

Code for the CVPR 2024 paper "Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models"

Score: 35 / 100 (Emerging)

This project helps graphic designers and artists understand how a text-to-image AI model, like Stable Diffusion, interprets descriptive text to create images. You provide a text prompt to generate an image, and then this tool shows you which parts of the image correspond to specific words in your prompt. This allows you to see the AI's 'thinking' behind its visual output.

No commits in the last 6 months.

Use this if you want to visualize how different words in your text prompt contribute to specific visual elements in the AI-generated image, or to refine those visual attributions.

Not ideal if you're looking for a tool to generate images directly without needing to analyze the internal workings of the diffusion model.

AI-art-generation image-interpretation prompt-engineering computer-graphics visual-AI-analysis
Status: Stale (6 months), No Package, No Dependents

Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 16 / 25
Community: 10 / 25


Stars: 71
Forks: 6
Language: Python
License: MIT
Last pushed: Jun 14, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/vpulab/ovam"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
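The same endpoint can also be queried programmatically. A minimal Python sketch, using only the standard library; the URL pattern is taken from the curl example above, but the response being JSON and its exact fields are assumptions, not documented here:

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for a repository."""
    return f"{API_BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch the quality record; assumes the API returns a JSON body."""
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


print(quality_url("diffusion", "vpulab", "ovam"))
# https://pt-edge.onrender.com/api/v1/quality/diffusion/vpulab/ovam
```

With a free API key (per the note above), you would presumably pass it in a header or query parameter; the exact mechanism isn't specified here, so the sketch uses the keyless tier only.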