limuloo/MIGC
[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)
This tool helps creative professionals and marketers generate images from text descriptions with explicit control over layout. You supply text prompts together with masks or bounding boxes that define where each object should appear. The output is an image in which multiple elements are placed at the specified locations and rendered consistently, and specific parts of the image can be refined through iterative edits.
615 stars. No commits in the last 6 months.
Use this if you need fine-grained control over the placement and attributes of multiple objects in text-to-image generation, such as for scene composition or visual storytelling.
Not ideal if you're looking for a simple, quick text-to-image generator without needing to specify exact object locations or if you lack the technical background to work with image masks and bounding boxes.
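To make the layout inputs mentioned above concrete, here is a minimal sketch of how a prompt might be paired with bounding boxes and how a box can be turned into a binary mask. This is an illustration only, not MIGC's actual interface: the `Region` class, the `box_to_mask` helper, and the normalized (x0, y0, x1, y1) box convention are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """Hypothetical container: one object phrase and where it should appear."""
    phrase: str
    # Assumed convention: (x0, y0, x1, y1), each normalized to [0, 1].
    box: Tuple[float, float, float, float]


def box_to_mask(box: Tuple[float, float, float, float],
                width: int, height: int) -> List[List[int]]:
    """Rasterize a normalized box into a height x width grid of 0/1 values."""
    x0, y0, x1, y1 = box
    left, right = round(x0 * width), round(x1 * width)
    top, bottom = round(y0 * height), round(y1 * height)
    return [
        [1 if left <= x < right and top <= y < bottom else 0
         for x in range(width)]
        for y in range(height)
    ]


# Example layout: two objects placed side by side in one scene.
layout = [
    Region("a red apple", (0.0, 0.25, 0.5, 0.75)),
    Region("a green pear", (0.5, 0.25, 1.0, 0.75)),
]
masks = [box_to_mask(r.box, width=8, height=8) for r in layout]
```

Each mask marks the pixels reserved for one phrase. How such regions are actually passed to the model depends on the repository's own code, which this listing does not describe.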
Stars: 615
Forks: 31
Language: Python
License: —
Category:
Last pushed: May 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/limuloo/MIGC"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
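The same request can be made from Python. The sketch below uses only the standard library. It rests on two assumptions the page does not confirm: that the response body is JSON, and that the URL path follows a category/owner/repo pattern for other projects. `build_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (assumed pattern: BASE/category/owner/repo)."""
    return f"{BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body; its schema is not
    documented on this page, so the parsed value is returned as-is.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "limuloo", "MIGC")` issues the same GET request as the curl command above and counts against the same daily limit.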
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...