nihaljn/multimodal-prompting
Enabling the use of multiple modalities while prompting Stable Diffusion
This tool helps creative professionals such as artists, designers, and marketers generate unique images from a blend of text descriptions and reference images. You provide a prompt that mixes text with placeholders for images (e.g., "A tiger taking a walk on [img]"), specify the actual images, and set how much influence each image should have. The output is a new image that combines elements from your text and all provided visual references.
No commits in the last 6 months.
Use this if you want to generate images where you precisely control the output by combining conceptual text descriptions with specific visual styles or elements from existing images.
Not ideal if you prefer to generate images solely from text prompts or if you don't have specific reference images to guide your creation.
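To make the placeholder idea concrete, here is a minimal sketch of how a prompt like "A tiger taking a walk on [img]" might be split into ordered text fragments and weighted image slots before conditioning a diffusion model. The `PromptPart` class and `parse_prompt` helper are hypothetical illustrations, not this repository's actual API:

```python
from dataclasses import dataclass

@dataclass
class PromptPart:
    kind: str            # "text" or "image"
    value: str           # text fragment, or an image path for an image slot
    weight: float = 1.0  # how much influence this image has on the output

def parse_prompt(prompt, images, weights=None):
    """Split a prompt containing [img] placeholders into ordered parts.

    `images` supplies one path per placeholder; `weights` optionally sets
    each image's influence (defaulting to 1.0).
    """
    pieces = prompt.split("[img]")
    if len(pieces) - 1 != len(images):
        raise ValueError("number of [img] placeholders must match images")
    weights = weights or [1.0] * len(images)
    parts = []
    for i, text in enumerate(pieces):
        if text:
            parts.append(PromptPart("text", text))
        if i < len(images):
            parts.append(PromptPart("image", images[i], weights[i]))
    return parts

# Example: one image placeholder with 70% influence.
parts = parse_prompt("A tiger taking a walk on [img]", ["beach.png"], [0.7])
```

Each text part would then be encoded with the text encoder and each image part with an image encoder, with the weights scaling the image embeddings' contribution.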
Stars
15
Forks
3
Language
Python
License
—
Category
Last pushed
Oct 10, 2022
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/nihaljn/multimodal-prompting"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
neggles/animatediff-cli
a CLI utility/library for AnimateDiff stable diffusion generation
sakalond/StableGen
Transform your 3D texturing workflow with the power of generative AI, directly within Blender!
victordibia/peacasso
UI interface for experimenting with multimodal (text, image) models (stable diffusion).
ai-forever/Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
carefree0910/carefree-drawboard
🎨 Infinite Drawboard in Python