Nithin-GK/UniteandConquer
[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models
This project lets graphic designers, artists, and marketers generate images from several kinds of input at once, such as text descriptions, face sketches, and semantic masks for attributes like hair. Given multiple controls, it produces a single high-quality image consistent with all of them. It suits anyone who needs to quickly prototype or generate varied visual content from complex specifications.
No commits in the last 6 months.
Use this if you need to generate images, especially faces or complex scenes, by combining specific text prompts with visual guides like sketches or masks.
Not ideal if you only need basic image generation from a single text prompt without incorporating additional visual controls.
Stars: 36
Forks: 3
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 31, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Nithin-GK/UniteandConquer"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...