tianyu-z/VCR

Official repository for the paper "VCR: Visual Caption Restoration". See arxiv.org/pdf/2406.06462 for details.


Emerging

This project offers datasets and tools for evaluating how well AI models can read and understand text embedded within images, especially when parts of the text are hidden. The task gives a model an image whose embedded caption is partially obscured and asks it to restore the missing text. This is useful for researchers and developers building and assessing AI models that process both images and text.
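
To make the task concrete, here is a minimal sketch of how a restored caption might be scored against the hidden ground truth. The function names, the normalization step, and the choice of exact match and token-level Jaccard overlap are illustrative assumptions, not the repository's evaluation code.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens (hypothetical normalization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def exact_match(predicted: str, reference: str) -> bool:
    """True when the restored text equals the reference after normalization."""
    return normalize(predicted) == normalize(reference)


def jaccard(predicted: str, reference: str) -> float:
    """Token-set overlap between restored text and reference, in [0, 1]."""
    p, r = set(normalize(predicted)), set(normalize(reference))
    if not p and not r:
        return 1.0  # two empty strings count as identical
    return len(p & r) / len(p | r)


if __name__ == "__main__":
    # Hypothetical model output versus the hidden caption span.
    print(exact_match("A red Bus.", "a red bus"))
    print(jaccard("a red bus", "a red car"))
```

Exact match rewards only a fully correct restoration, while the overlap score gives partial credit when some of the hidden words are recovered.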

No commits in the last 6 months.

Use this if you are developing or evaluating large vision-language models and need a robust benchmark for their ability to interpret text within complex visual contexts.

Not ideal if you are looking for a ready-to-use application for optical character recognition (OCR) or a solution to repair damaged images.

AI model evaluation, vision-language processing, text in images, machine reading comprehension

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 9 / 25


Language: Python

License: CC-BY-SA-4.0

Higher-rated alternatives

xISSAx/Alpha-Co-Vision

A real-time video caption to conversation bot that captures frames generates captions and...

naver-ai/eccv-caption

Extended COCO Validation (ECCV) Caption dataset (ECCV 2022)
