nttmdlab-nlp/SlideVQA

SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)

Quality score: 36 / 100 (Emerging)

This project provides a comprehensive dataset for training and evaluating systems that answer questions based on information spread across multiple slide images, such as a presentation deck. Each example pairs a slide deck (a collection of images) with a question; a system must identify the relevant slides and produce the answer. This is useful for researchers and developers working on intelligent document analysis and visual question answering.

105 stars. No commits in the last 6 months.

Use this if you are developing or evaluating AI models that need to extract information and answer questions from multi-page documents, specifically slide presentations.

Not ideal if you are looking for an out-of-the-box application to answer questions from your own slide decks, as this is a dataset for model development, not an end-user tool.

document-intelligence visual-question-answering slide-analysis multi-document-processing information-extraction
Stale (6m) | No Package | No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 11 / 25


Stars: 105
Forks: 8
Language: Python
License: (not listed)
Last pushed: Mar 31, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/nttmdlab-nlp/SlideVQA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.