Nithin-GK/UniteandConquer
[CVPR '23] Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models
This project lets graphic designers, artists, and marketers generate images from several kinds of input at once, such as text descriptions, face sketches, and semantic masks for attributes like hair. Given multiple controls, it produces a single high-quality image consistent with all of them. It suits anyone who needs to quickly prototype or generate varied visual content from complex specifications.
No commits in the last 6 months.
Use this if you need to generate images, especially faces or complex scenes, by combining specific text prompts with visual guides like sketches or masks.
Not ideal if you only need basic image generation from a single text prompt without incorporating additional visual controls.
Stars: 36
Forks: 3
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 31, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Nithin-GK/UniteandConquer"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...