pmh9960/GCDP

Official PyTorch implementation of "Learning to Generate Semantic Layouts for Higher Text-Image Correspondence in Text-to-Image Synthesis." (ICCV 2023)

Experimental

This project is aimed at graphic designers, content creators, and researchers who need to generate realistic images from text descriptions, especially when working with specialized datasets such as urban scenes or celebrity faces. It takes a text prompt and generates both an image and a corresponding semantic layout (a mask showing where each object or region sits). Generating the layout together with the image is intended to improve how closely the image matches the text description, even with limited training data.

No commits in the last 6 months.

Use this if you need generated images that closely match their text descriptions in a specific domain where large text-image datasets are not available, such as architectural visualization or character design.

Not ideal for general-purpose image generation where web-scale datasets are available, since its main benefit is improved text-image correspondence in niche domains.

Image Generation, Content Creation, Computer Vision Research, Digital Art, Semantic Segmentation

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 8 / 25

Community 5 / 25

Language: Python

License: none

Higher-rated alternatives

Vchitect/VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

VectorSpaceLab/OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

EndlessSora/focal-frequency-loss

[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

JIA-Lab-research/DreamOmni2

This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...

SkyworkAI/UniPic

Open-source SOTA multi-image editing model
