ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
This system helps students and researchers convert complex educational documents like exam papers into structured, AI-ready data. It takes PDFs containing multilingual text, math equations, tables, and diagrams, and outputs semantically enriched JSON or Markdown. The output includes natural language descriptions of images and tables, making it easier to create high-quality datasets for training machine learning models.
682 stars. No commits in the last 6 months.
Use this if you need to extract and semantically annotate content from scientific or academic PDFs, especially those with dense layouts, for machine learning training or advanced study.
Not ideal if you're looking for a simple OCR to digitize basic text documents or an out-of-the-box solution that doesn't require further processing or integration into an ML workflow.
Stars: 682
Forks: 49
Language: Python
License: none listed
Category:
Last pushed: May 20, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/ses4255/Versatile-OCR-Program"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
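The same request can be made from a script. The following is a minimal Python sketch using only the standard library, not an official client. The helper names `build_url` and `fetch_quality` are hypothetical, the path layout is copied from the curl example above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is
# category / owner / repository.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    parts = (category, owner, repo)
    # Percent-encode each segment so unusual characters cannot break the path.
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON. If it does not,
    json.loads raises a ValueError.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "ses4255", "Versatile-OCR-Program")` would request the same URL as the curl command. The sketch sends no API key, so it falls under the keyless 100 requests/day limit.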
Higher-rated alternatives
Psarpei/Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and...
Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
Sudhanshu1304/table-transformer
🔍 Table Extraction Tool: A powerful open-source solution combining OCR and computer vision for...
asagar60/TableNet-pytorch
PyTorch Implementation of TableNet
JG1VPP/MuTabNet
ICDAR 2024 Table OCR Model