mehdidc/feed_forward_vqgan_clip
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt
This tool generates images directly from text descriptions in a single forward pass, removing the per-prompt latent optimization that standard VQGAN-CLIP pipelines require. You provide a text prompt describing the image you want, and it returns an RGB image. It's designed for digital artists, content creators, or anyone who needs to visualize concepts quickly without extensive manual image manipulation.
140 stars. No commits in the last 6 months.
Use this if you need to rapidly create unique images or visual concepts from text descriptions, or generate multiple diverse images from a single prompt.
Not ideal if you require precise control over every pixel of the output, or if your primary need is photo-realistic images of real-world scenes.
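For intuition, here is a minimal sketch of the feed-forward idea in PyTorch. It is illustrative only, not this repo's actual code: the TextToLatent class, its layer sizes, and the vqgan.decode call are hypothetical stand-ins. The point is that a small network maps a CLIP text embedding straight to a VQGAN latent grid, so generating an image is one forward pass rather than an optimization loop over latents.

import torch
import torch.nn as nn

class TextToLatent(nn.Module):
    # Hypothetical mapper: CLIP text embedding -> VQGAN latent grid.
    def __init__(self, clip_dim=512, latent_channels=256, grid=16):
        super().__init__()
        self.latent_channels = latent_channels
        self.grid = grid
        self.net = nn.Sequential(
            nn.Linear(clip_dim, 1024),
            nn.GELU(),
            nn.Linear(1024, latent_channels * grid * grid),
        )

    def forward(self, text_emb):
        z = self.net(text_emb)
        return z.view(-1, self.latent_channels, self.grid, self.grid)

mapper = TextToLatent()
dummy_emb = torch.randn(1, 512)  # stands in for a CLIP text embedding
z = mapper(dummy_emb)            # (1, 256, 16, 16) latent grid, one forward pass
# image = vqgan.decode(z)        # a real VQGAN decoder would turn z into RGB pixels

Training such a mapper would maximize CLIP similarity between the decoded image and the prompt, which is what lets inference skip per-prompt optimization entirely.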
Stars: 140
Forks: 18
Language: Python
License: MIT
Category: Diffusion
Last pushed: Jan 03, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/mehdidc/feed_forward_vqgan_clip"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
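If you prefer Python over curl, here is a minimal sketch using the requests library. The endpoint is the one shown above; how to attach an API key is not documented on this page, so the sketch uses the keyless free tier, and the JSON response format is assumed.

import requests

url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "diffusion/mehdidc/feed_forward_vqgan_clip")
resp = requests.get(url, timeout=10)  # keyless tier: 100 requests/day
resp.raise_for_status()
print(resp.json())  # repo quality metrics as JSON (assumed response format)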
Higher-rated alternatives
NVlabs/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
FoundationVision/VAR
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈]...
nerdyrodent/VQGAN-CLIP
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
huggingface/finetrainers
Scalable and memory-optimized training of diffusion models
AssemblyAI-Community/MinImagen
MinImagen: A minimal implementation of the Imagen text-to-image model