showlab/VisorGPT
[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT
This project helps graphic designers, digital artists, and marketing professionals create custom images with precise control over object placement and visual style. You provide a text description and specify where objects should go using bounding boxes, keypoints, or masks, and it generates an image that follows that spatial layout. This gives you fine-grained creative direction beyond what a text prompt alone allows.
137 stars. No commits in the last 6 months.
Use this if you need to generate images where the exact positioning and arrangement of elements are crucial, such as for product mockups, character poses, or scene compositions.
Not ideal if you just need quick, general image generation from a text prompt without detailed spatial control.
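To make the idea of a spatial layout concrete, here is a minimal sketch of a layout represented as labeled bounding boxes. The `LayoutBox` class, `validate_layout` function, and normalized-coordinate convention are hypothetical illustrations, not VisorGPT's actual data format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutBox:
    """Hypothetical labeled box with corners normalized to [0, 1]."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def to_pixels(self, width: int, height: int) -> tuple[int, int, int, int]:
        # Scale normalized corners to pixel coordinates for a given canvas.
        return (
            round(self.x0 * width),
            round(self.y0 * height),
            round(self.x1 * width),
            round(self.y1 * height),
        )


def validate_layout(boxes: list[LayoutBox]) -> list[LayoutBox]:
    # Reject boxes that are empty, inverted, or fall outside the canvas.
    for b in boxes:
        if not (0.0 <= b.x0 < b.x1 <= 1.0 and 0.0 <= b.y0 < b.y1 <= 1.0):
            raise ValueError(f"invalid box for {b.label!r}")
    return boxes


# Example layout for a product mockup: a bottle standing on a table.
layout = validate_layout([
    LayoutBox("bottle", 0.25, 0.5, 0.5, 1.0),
    LayoutBox("table", 0.0, 0.75, 1.0, 1.0),
])
```

A layout like this would be paired with a text prompt and handed to a layout-conditioned image model; how that handoff works in VisorGPT itself is not covered by this listing.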
Stars: 137
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: May 04, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/showlab/VisorGPT"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
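The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the listing does not confirm: that the three path segments after `/quality/` are a category, an owner, and a repository name, and that the endpoint returns JSON. The helper names `build_quality_url` and `fetch_quality` are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    # Assumption: the path layout is /quality/<category>/<owner>/<repo>,
    # inferred from the single example URL in the listing.
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    # Assumption: the response body is JSON; its schema is not documented here.
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network request and counts against the daily limit):
# data = fetch_quality("diffusion", "showlab", "VisorGPT")
```

With no key, calls count against the 100 requests/day limit, so caching the result locally is a sensible default.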
Higher-rated alternatives
zai-org/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
zhaorw02/DeepMesh
[ICCV 2025] Official code of DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with...
thu-nics/FrameFusion
[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token...
Yushi-Hu/tifa
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering