zeyofu/Commonsense-T2I
Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]
This project evaluates how well text-to-image (T2I) models apply real-world common sense when generating images. It works on pairs of similar text prompts that differ in a subtle commonsense detail (e.g., 'a lightbulb without electricity' vs. 'a lightbulb with electricity'): the model under test generates an image for each prompt, and the benchmark scores how faithfully the outputs reflect those everyday distinctions. It is aimed at researchers and practitioners who develop or use AI image-generation models and need to assess their nuanced understanding of the world.
No commits in the last 6 months.
Use this if you are a researcher or developer who wants to benchmark how well your text-to-image models handle everyday common sense compared to other state-of-the-art models.
Not ideal if you are a general user simply looking to generate images, as this tool is focused on evaluating model performance, not casual image creation.
Stars: 24
Forks: 1
Language: Python
License: Apache-2.0
Category: diffusion
Last pushed: Aug 13, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/zeyofu/Commonsense-T2I"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
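If you'd rather query the endpoint from code than from curl, a minimal Python sketch follows. Only the base URL and path shape are taken from the curl command above; the response schema is not documented here, so the actual fetch is left commented out rather than parsed into assumed fields.

```python
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository,
    percent-encoding each path segment."""
    return f"{API_BASE}/{quote(category, safe='')}/{quote(owner, safe='')}/{quote(repo, safe='')}"

url = quality_url("diffusion", "zeyofu", "Commonsense-T2I")

# To actually fetch the JSON payload (100 requests/day without a key):
# import json, urllib.request
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)
```

With an API key (1,000 requests/day), you would presumably attach it as a header or query parameter; check the service's docs for the exact mechanism, as it is not specified here.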
Higher-rated alternatives
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EndlessSora/focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
JIA-Lab-research/DreamOmni2
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...
SkyworkAI/UniPic
Open-source SOTA multi-image editing model