Nithin-GK/MaxFusion
[ECCV'24] MaxFusion: Plug & Play multimodal generation in text to image diffusion models
MaxFusion helps digital artists and content creators combine various creative influences to generate unique images. You provide descriptive text alongside other guiding inputs like sketches or reference images, and it outputs a high-quality image that blends all these elements. It's for anyone in creative fields who wants more control and flexibility in generating visual content.
No commits in the last 6 months.
Use this if you need to generate images that incorporate multiple, sometimes conflicting, creative inputs beyond just text prompts.
Not ideal if you only need basic text-to-image generation, or if your workflow depends on extensive custom model training — MaxFusion is a training-free, plug-and-play approach.
Stars: 27
Forks: 2
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Nov 02, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Nithin-GK/MaxFusion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
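If you prefer to consume the endpoint from a script rather than curl, a minimal Python sketch follows. The URL path layout is taken from the curl example above; the shape of the JSON response is not documented on this page, so the helper names (`quality_url`, `fetch_quality`) and the assumption that the endpoint returns a JSON object are illustrative only.

```python
import json
from urllib.request import urlopen

# Base path copied from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/diffusion"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repo endpoint URL (path layout from the curl example)."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload. The response schema is not
    documented here, so inspect the returned keys before relying on them."""
    with urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

# Network call left to the caller, e.g.:
#   data = fetch_quality("Nithin-GK", "MaxFusion")
print(quality_url("Nithin-GK", "MaxFusion"))
```

Unauthenticated calls are limited to 100 requests/day, so cache responses locally if you poll many repositories.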
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...