YangLing0818/RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Score: 43 / 100 (Emerging)

This project helps artists, designers, and marketers generate detailed, specific images from complex text descriptions. You provide a detailed text prompt, and it produces a high-resolution image that follows the description closely, even when the prompt involves multiple objects and scenes. It's designed for anyone who needs to create accurate visual content from complex textual ideas.

1,843 stars. No commits in the last 6 months.

Use this if you need to generate images that closely match intricate, multi-part text descriptions, with each element and the relationships between elements depicted accurately.

Not ideal if you only need simple image generation from basic prompts, or if you prefer not to integrate with external multimodal LLMs.

digital art · graphic design · content creation · visual storytelling · marketing visuals
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25


Stars: 1,843
Forks: 103
Language: Jupyter Notebook
License: MIT
Last pushed: Feb 01, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/YangLing0818/RPG-DiffusionMaster"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
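
The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body and that the path follows a category/owner/repository layout, both inferred from the single URL shown here rather than from any documented schema. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>, based on that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the path.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "YangLing0818", "RPG-DiffusionMaster")` issues the same request as the curl command. The response fields are not documented on this page, so inspect the returned object before relying on any particular key.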