tianyu-z/VCR
Official repo for the paper "VCR: Visual Caption Restoration". See arxiv.org/pdf/2406.06462 for details.
This project offers datasets and tools for evaluating how well AI models can read and understand text embedded within images, especially when parts of the text are hidden. In the task, a model receives an image with a partially obscured caption and must restore the missing text. This is useful for researchers and developers building and assessing advanced AI models that process both images and text.
No commits in the last 6 months.
Use this if you are developing or evaluating large vision-language models and need a robust benchmark for their ability to interpret text within complex visual contexts.
Not ideal if you are looking for a ready-to-use application for optical character recognition (OCR) or a solution to repair damaged images.
Stars: 32
Forks: 3
Language: Python
License: CC-BY-SA-4.0
Category
Last pushed: Feb 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/tianyu-z/VCR"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
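For scripted use, the same request as the curl command can be sketched in Python with only the standard library. This is a minimal sketch, not a documented client. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the code assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    parts = [quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON.

    Assumption: the response is JSON. If it is not, json.loads raises
    a ValueError and the caller should inspect the raw body instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("computer-vision", "tianyu-z", "VCR")` issues one request, which counts against the 100 requests/day keyless limit, so it is worth caching the result when polling many repositories.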