naver-ai/eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)


Emerging

A common challenge when developing or evaluating AI models that understand both images and text is verifying that the model correctly associates each image with its relevant captions, and vice versa. This project provides an extended dataset and a toolkit for measuring how accurately an image-text model performs these associations. It takes your model's ranked lists of captions for each image, or of images for each caption, and outputs a suite of performance metrics. It is intended for researchers and engineers building and benchmarking multimodal AI models.
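To make the "ranked lists in, metrics out" idea concrete, here is a minimal sketch of how ranking metrics can be computed from such lists. It is not the project's own API: the function names (`recall_at_k`, `r_precision`, `evaluate`), the dictionary layout, and the choice of Recall@K and R-Precision as example metrics are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def recall_at_k(ranked: List[str], positives: Set[str], k: int) -> float:
    """Return 1.0 if at least one positive item appears in the top k, else 0.0."""
    return float(any(item in positives for item in ranked[:k]))


def r_precision(ranked: List[str], positives: Set[str]) -> float:
    """Fraction of positives found in the top R results, where R = len(positives)."""
    r = len(positives)
    if r == 0:
        return 0.0
    hits = sum(1 for item in ranked[:r] if item in positives)
    return hits / r


def evaluate(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
    k: int = 1,
) -> Dict[str, float]:
    """Average both metrics over every query (hypothetical helper, not a library call)."""
    n = len(rankings)
    recall = sum(recall_at_k(r, ground_truth[q], k) for q, r in rankings.items()) / n
    rprec = sum(r_precision(r, ground_truth[q]) for q, r in rankings.items()) / n
    return {f"recall@{k}": recall, "r_precision": rprec}


# Toy image-to-caption example: each image ID maps to captions ranked by the model.
rankings = {
    "img1": ["c3", "c1", "c2"],
    "img2": ["c4", "c5", "c6"],
}
# Toy ground truth: the set of captions considered correct for each image.
ground_truth = {
    "img1": {"c1", "c2"},
    "img2": {"c4"},
}

scores = evaluate(rankings, ground_truth, k=1)
print(scores)
```

The same functions work for caption-to-image retrieval by swapping the roles of the keys and the ranked items. A larger set of verified positives per query, which is what an extended ground truth offers, mainly affects metrics such as R-Precision that depend on how many correct matches exist.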

No commits in the last 6 months. Available on PyPI.

Use this if you are evaluating the performance of your image-text matching AI model and need more accurate, human- and machine-verified ground truth data beyond the original COCO Caption dataset, along with standardized metrics.

Not ideal if you are a casual user looking for an out-of-the-box image captioning or image search solution; this is a toolkit for model evaluation, not a deployed application.

image-text-matching multimodal-AI model-evaluation computer-vision natural-language-processing

Stale 6m No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 25 / 25

Community 5 / 25


Language

Python

License

—

Higher-rated alternatives

xISSAx/Alpha-Co-Vision

A real-time video caption-to-conversation bot that captures frames, generates captions and...

tianyu-z/VCR

Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
