svjack/docvqa-gen

Document visual question answering (DocVQA) dataset generator for English and Chinese

Score: 21 / 100 (Experimental)

This tool helps researchers and data scientists generate question-answer pairs directly from document images in both English and Chinese. You input an image containing text, such as a scanned document or a screenshot, and it outputs a list of relevant questions and their corresponding answers extracted from that image. It's designed for anyone working with document-based information who needs to create structured Q&A datasets.

No commits in the last 6 months.

Use this if you need to quickly build question-answer datasets from document images, whether to train AI models or to analyze the documents' content.

Not ideal if you primarily work with plain text documents and do not require image-based question generation.

document-intelligence, data-labeling, optical-character-recognition, information-extraction, dataset-generation
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 7 / 25


Stars: 24
Forks: 2
Language: Jupyter Notebook
License: None
Last pushed: Apr 17, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/svjack/docvqa-gen"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
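
The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns JSON, which this page does not confirm. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating the `transformers` path segment as a category is a guess based on the URL shape alone.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL in the same shape as the curl example.

    Assumption: the path is <category>/<owner>/<repo>; only the single
    URL shown on this page is known to follow that pattern.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("transformers", "svjack", "docvqa-gen")
# print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes it possible to check the request target without spending any of the 100 keyless requests per day.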