nttmdlab-nlp/SlideVQA
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)
This project provides a dataset for training and evaluating systems that answer questions using information spread across multiple slide images, such as those in a presentation deck. Given a slide deck (a collection of images) and a question, a system must identify the relevant slides and produce the answer. It is useful for researchers and developers working on intelligent document analysis and visual question answering.
105 stars. No commits in the last 6 months.
Use this if you are developing or evaluating AI models that need to extract information and answer questions from multi-page documents, specifically slide presentations.
Not ideal if you are looking for an out-of-the-box application to answer questions from your own slide decks, as this is a dataset for model development, not an end-user tool.
Stars
105
Forks
8
Language
Python
License
—
Category
Computer Vision
Last pushed
Mar 31, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/nttmdlab-nlp/SlideVQA"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
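The same endpoint can also be called from code. A minimal Python sketch using only the standard library is shown below; the `fetch_quality` helper is hypothetical (not part of the API's documentation), and it assumes the endpoint returns a JSON body, which the source does not confirm.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository's quality data."""
    return f"{BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the payload (assumes a JSON response)."""
    with urllib.request.urlopen(build_url(category, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Reconstructs the exact URL from the curl example.
    print(build_url("computer-vision", "nttmdlab-nlp", "SlideVQA"))
```

Parsing `fetch_quality`'s result further would require knowing the response schema, which the listing does not document.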