TonyLianLong/LLM-groundedDiffusion
LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models (LMD, TMLR 2024)
This project helps graphic designers, marketers, and artists create highly specific images from text descriptions. You provide a detailed prompt, and the system uses an LLM to interpret it and generate a high-quality image that matches your instructions, including the placement of individual objects.
481 stars. No commits in the last 6 months.
Use this if you need to generate images from text prompts and require precise control over the objects, their locations, and the overall composition, moving beyond the general interpretations of standard text-to-image tools.
Not ideal if you're looking for a simple, one-click image generation tool where general creative interpretation is acceptable, or if you prefer a workflow that doesn't involve detailed text prompting for layout.
Stars
481
Forks
34
Language
Python
License
—
Category
Last pushed
Sep 09, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/TonyLianLong/LLM-groundedDiffusion"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
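If you consume the endpoint above from code rather than curl, a minimal client sketch might look like the following. The response field names used here (`stars`, `forks`, `language`, `last_pushed`, `commits_30d`) are assumptions for illustration based on the fields shown on this page; check the actual response schema before relying on them.

```python
import json

def summarize(repo: dict) -> str:
    """Render a one-line summary from a repo-stats payload.

    Field names are hypothetical; adjust to the real API schema.
    """
    active = "active" if repo["commits_30d"] > 0 else "inactive"
    return (f"{repo['stars']} stars / {repo['forks']} forks, "
            f"{repo['language']}, {active} (last push {repo['last_pushed']})")

# In practice you would fetch the JSON with urllib.request or requests, e.g.:
#   body = urllib.request.urlopen(API_URL).read()
# Here we parse a sample payload mirroring the stats shown on this page.
sample = json.loads("""
{
  "stars": 481,
  "forks": 34,
  "language": "Python",
  "last_pushed": "2024-09-09",
  "commits_30d": 0
}
""")

print(summarize(sample))
```

The summary function is separated from the network call so it can be unit-tested on a static payload without hitting the rate-limited endpoint.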
Higher-rated alternatives
ljleb/sd-mecha
Executable State Dict Recipes
SJTU-DENG-Lab/Discrete-Diffusion-Forcing
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Li-Jinsong/DAEDAL
[ICLR 2026] Official repository of "Beyond Fixed: Training-Free Variable-Length Denoising for...
SalesforceAIResearch/CoDA
Salesforce AI Research's open diffusion language model