matsuolab/multibanana
[CVPR 2026 Main] MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation
This project provides a standardized way to test how well AI models can create new images from multiple existing images plus a text description. It takes a collection of reference images and a text prompt, then evaluates the AI-generated image against both. It is aimed at AI researchers and developers building or evaluating text-to-image generation models who need to verify that their models faithfully reflect the provided references.
Use this if you are developing or comparing text-to-image AI models and need a reliable, consistent way to benchmark their performance, especially in scenarios that require adherence to multiple visual references.
Not ideal if you are an end user who wants to generate images, rather than to develop or evaluate the underlying generation models.
Stars: 20
Forks: —
Language: Python
License: —
Category: —
Last pushed: Mar 17, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/matsuolab/multibanana"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
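The same endpoint can be called from Python. A minimal sketch is below; note that the response schema is not documented on this page, so the code only builds the URL and leaves the actual fetch as a commented-out call rather than assuming any JSON fields:

```python
import urllib.request

# Base path for the pt-edge quality API, as shown in the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/diffusion"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"{BASE}/{owner}/{repo}"

url = quality_url("matsuolab", "multibanana")
# Keyless access is limited to 100 requests/day; uncomment to fetch:
# body = urllib.request.urlopen(url).read()
print(url)
```

This mirrors the curl command exactly; swap in any `owner`/`repo` pair tracked by the index to query a different repository.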
Higher-rated alternatives
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
VectorSpaceLab/OmniGen
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
EndlessSora/focal-frequency-loss
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis
JIA-Lab-research/DreamOmni2
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing...
SkyworkAI/UniPic
Open-source SOTA multi-image editing model