youweiliang/RichHF
Code for CVPR'24 best paper: Rich Human Feedback for Text-to-Image Generation (https://arxiv.org/pdf/2312.10240)
This project helps researchers and developers evaluate the quality of images generated from text descriptions. It takes an image and its corresponding text prompt as input, generates heatmaps that mark regions of implausibility or misalignment, and outputs numeric scores for plausibility, aesthetics, and text-image alignment. It is designed for machine learning researchers who want to understand and improve text-to-image models.
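To make the heatmap output more concrete, here is a minimal sketch of how a per-pixel heatmap could be blended over an image for inspection. It is not taken from the repository: the function name `overlay_heatmap`, the assumption that the heatmap is a 2D array with values in [0, 1] at the same resolution as the image, and the red overlay colour are all illustrative choices.

```python
import numpy as np


def overlay_heatmap(image: np.ndarray, heatmap: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a red overlay into an RGB image, weighted per pixel by a heatmap.

    Hypothetical helper: assumes `image` is (H, W, 3) uint8 and `heatmap`
    is (H, W) with values in [0, 1]. The real model output may differ.
    """
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("image must have shape (H, W, 3)")
    if heatmap.shape != image.shape[:2]:
        raise ValueError("heatmap must have shape (H, W) matching the image")

    # Per-pixel blend weight in [0, alpha], broadcast over the colour channels.
    weight = alpha * np.clip(heatmap, 0.0, 1.0)[..., None]

    # Solid red layer to blend towards where the heatmap is high.
    red = np.zeros_like(image, dtype=np.float32)
    red[..., 0] = 255.0

    blended = (1.0 - weight) * image.astype(np.float32) + weight * red
    return blended.round().astype(np.uint8)
```

Pixels where the heatmap is 0 are left unchanged, and pixels where it is 1 move towards red by the fraction `alpha`, so flagged regions stand out while the underlying image stays visible.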
No commits in the last 6 months.
Use this if you are a researcher developing or evaluating text-to-image generation models and need to quantify and visualize how well generated images match human perception and their text prompts.
Not ideal if you need this tool for commercial purposes, as the licenses for both the code and the model weights prohibit commercial use.
Stars: 31
Forks: 1
Language: Python
License: —
Category:
Last pushed: Sep 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/youweiliang/RichHF"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
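The same request can be made from Python. The sketch below uses only the standard library and the URL from the curl command above. Two things are assumptions rather than documented behaviour: that the path follows a `category/owner/repo` pattern for other repositories, and that the endpoint returns JSON. The helper names `build_url` and `fetch_quality` are made up for illustration.

```python
import json
import urllib.request

# Base path taken from the curl example; treating the remainder as
# category/owner/repo is an assumption.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for one repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumes the response body is JSON; raises urllib.error.HTTPError on
    non-2xx responses, which would include any rate-limit rejection.
    """
    request = urllib.request.Request(
        build_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "youweiliang", "RichHF")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, it makes sense to cache results locally rather than re-fetching on every run.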
Higher-rated alternatives
UCSC-VLAA/story-iter
[ICLR 2026] A Training-free Iterative Framework for Long Story Visualization
PaddlePaddle/PaddleMIX
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks,...
keivalya/mini-vla
a minimal, beginner-friendly VLA to show how robot policies can fuse images, text, and states to...
adobe-research/custom-diffusion
Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023)
byliutao/1Prompt1Story
🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation...