TonyLianLong/LLM-groundedDiffusion
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LMD, TMLR 2024)
This project helps graphic designers, marketers, and artists create highly specific images from text descriptions. You provide a detailed prompt, and the system uses an LLM to interpret it and generate a high-quality image that matches your instructions, including the placement of individual objects.
481 stars. No commits in the last 6 months.
Use this if you need to generate images from text prompts and require precise control over the objects, their locations, and the overall composition, moving beyond the general interpretations of standard text-to-image tools.
Not ideal if you're looking for a simple, one-click image generation tool where general creative interpretation is acceptable, or if you prefer a workflow that doesn't involve detailed text prompting for layout.
Stars
481
Forks
34
Language
Python
License
—
Category
Last pushed
Sep 09, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/TonyLianLong/LLM-groundedDiffusion"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
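If you consume the endpoint above from code rather than curl, a minimal client sketch might look like the following. The response field names used here (`stars`, `forks`, `language`, `last_pushed`, `commits_30d`) are assumptions for illustration based on the fields shown on this page; check the actual response schema before relying on them.

```python
import json

def summarize(repo: dict) -> str:
    """Render a one-line summary from a repo-stats payload.

    Field names are hypothetical; adjust to the real API schema.
    """
    active = "active" if repo["commits_30d"] > 0 else "inactive"
    return (f"{repo['stars']} stars / {repo['forks']} forks, "
            f"{repo['language']}, {active} (last push {repo['last_pushed']})")

# In practice you would fetch the JSON with urllib.request or requests, e.g.:
#   body = urllib.request.urlopen(API_URL).read()
# Here we parse a sample payload mirroring the stats shown on this page.
sample = json.loads("""
{
  "stars": 481,
  "forks": 34,
  "language": "Python",
  "last_pushed": "2024-09-09",
  "commits_30d": 0
}
""")

print(summarize(sample))
```

The summary function is separated from the network call so it can be unit-tested on a static payload without hitting the rate-limited endpoint.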
Higher-rated alternatives
ljleb/sd-mecha
Executable State Dict Recipes
SJTU-DENG-Lab/Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Li-Jinsong/DAEDAL
[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...
SalesforceAIResearch/CoDA
Salesforce AI Research's open diffusion language model