showlab/VisorGPT

[NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT

/ 100

Emerging

This project helps graphic designers, digital artists, and marketing professionals create custom images with precise control over object placement and visual style. You provide a text description and specify a spatial layout for the objects, such as bounding boxes, keypoints, or masks, and it generates an image that follows that layout. This allows for fine-grained creative direction beyond text prompts alone.

137 stars. No commits in the last 6 months.

Use this if you need to generate images where the exact positioning and arrangement of elements are crucial, such as for product mockups, character poses, or scene compositions.

Not ideal if you just need quick, general image generation from a text prompt without detailed spatial control.

digital art, graphic design, image generation, creative workflows, visual content creation

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 5 / 25

Stars

137

Forks

Language

Python

License

MIT

Higher-rated alternatives

zai-org/CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

zhaorw02/DeepMesh

[ICCV 2025] Official code of DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

YangLing0818/RPG-DiffusionMaster

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with...

thu-nics/FrameFusion

[ICCV'25] The official code of paper "Combining Similarity and Importance for Video Token...

Yushi-Hu/tifa

TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
