Psarpei/Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
This tool helps automate the extraction of data from tables found within scanned documents or images. It takes an image of a document as input and provides the detected tables along with their internal structure (rows and columns) in a machine-readable format. Anyone who regularly needs to digitize information from physical documents, like data entry specialists, archivists, or business analysts, would find this useful.
282 stars. No commits in the last 6 months.
Use this if you frequently need to convert data from tables in document images (including those that are rotated or noisy) into an organized, editable format without manual transcription.
Not ideal if your primary need is general text extraction (OCR) without a focus on table structure, or if you only deal with already digital, machine-readable tables.
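To make "internal structure (rows and columns) in a machine-readable format" concrete, here is a minimal sketch of one way such output could be represented. The `TableStructure` class, its field names, and the pixel coordinates are hypothetical illustrations, not the repository's actual output format, which this page does not describe.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in image pixels
Box = Tuple[int, int, int, int]


@dataclass
class TableStructure:
    """Hypothetical container for one detected table (not the repository's own type)."""

    bbox: Box             # where the table sits on the page
    row_edges: List[int]  # y-coordinates of row boundaries, top to bottom
    col_edges: List[int]  # x-coordinates of column boundaries, left to right

    def cells(self) -> List[List[Box]]:
        """Return a row-major grid of cell boxes between consecutive boundaries."""
        return [
            [(x0, y0, x1, y1) for x0, x1 in zip(self.col_edges, self.col_edges[1:])]
            for y0, y1 in zip(self.row_edges, self.row_edges[1:])
        ]


# Made-up example: a 2 x 2 table.
table = TableStructure(
    bbox=(50, 100, 350, 200),
    row_edges=[100, 150, 200],
    col_edges=[50, 200, 350],
)
grid = table.cells()
print(len(grid), len(grid[0]))
```

Each cell box could then be cropped and passed to an OCR step to recover its text, which is the part that turns the detected structure into editable data.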
Stars: 282
Forks: 53
Language: Jupyter Notebook
License: MIT
Category: (not listed)
Last pushed: Sep 05, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Psarpei/Multi-Type-TD-TSR"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
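The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the helper names `quality_url` and `fetch_quality` are made up for illustration, and the response is assumed to be JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch one repository record.

    Assumption: the endpoint returns a JSON body. If it does not,
    json.loads raises a ValueError.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the network request.
    print(quality_url("Psarpei", "Multi-Type-TD-TSR"))
```

With the keyless limit of 100 requests/day, it is worth caching results locally when looping over many repositories.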
Related frameworks
Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
Sudhanshu1304/table-transformer
🔍 Table Extraction Tool: A powerful open-source solution combining OCR and computer vision for...
ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
asagar60/TableNet-pytorch
PyTorch Implementation of TableNet
JG1VPP/MuTabNet
ICDAR 2024 Table OCR Model