limuloo/MIGC

[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)

Emerging

This tool helps creative professionals and marketers generate images from text descriptions with explicit layout control. You supply text prompts together with masks or bounding boxes that define where each object should appear. The output is an image in which multiple elements are placed accurately and rendered consistently, and specific parts of the image can be refined through iterative edits.
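To make the input format more concrete, here is a minimal sketch of how a layout of text phrases and bounding boxes might be represented and turned into binary masks. The `Instance` type, the `box_to_mask` helper, and the normalized `(x0, y0, x1, y1)` box convention are assumptions made for illustration; they are not taken from the MIGC repository, whose actual interface may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical types for illustration only; this is NOT MIGC's actual API.
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), normalized to [0, 1]


@dataclass
class Instance:
    phrase: str  # text describing one object, e.g. "a red apple"
    box: Box     # region where that object should appear


def box_to_mask(box: Box, height: int, width: int) -> List[List[int]]:
    """Rasterize a normalized box into a height x width binary mask."""
    x0, y0, x1, y1 = box
    if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
        raise ValueError(f"invalid normalized box: {box}")
    # Convert normalized coordinates to pixel index ranges.
    c0, c1 = round(x0 * width), round(x1 * width)
    r0, r1 = round(y0 * height), round(y1 * height)
    return [
        [1 if (r0 <= r < r1 and c0 <= c < c1) else 0 for c in range(width)]
        for r in range(height)
    ]


# Two objects placed in opposite corners of an 8 x 8 grid.
layout = [
    Instance("a red apple", (0.0, 0.0, 0.5, 0.5)),
    Instance("a blue cup", (0.5, 0.5, 1.0, 1.0)),
]
masks = [box_to_mask(inst.box, height=8, width=8) for inst in layout]
covered = [sum(map(sum, m)) for m in masks]  # pixels covered by each mask
```

Each phrase is paired with exactly one region, which is the kind of per-object placement control the description refers to. A real pipeline would work at the generator's own resolution rather than on a toy 8 x 8 grid.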

615 stars. No commits in the last 6 months.

Use this if you need fine-grained control over the placement and attributes of multiple objects in text-to-image generation, such as for scene composition or visual storytelling.

Not ideal if you want a simple, quick text-to-image generator and don't need to specify exact object locations, or if you lack the technical background to work with image masks and bounding boxes.

digital-art content-creation visual-design marketing-asset-generation animation-production

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 13 / 25

Stars: 615

Forks:

Language: Python

License: not listed

Higher-rated alternatives

UCSC-VLAA/story-iter

[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization

PaddlePaddle/PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...

keivalya/mini-vla

a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...

adobe-research/custom-diffusion

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)

byliutao/1Prompt1Story

🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...
