abdoelsayed2016/TNCR_Dataset
Deep learning, Convolutional neural networks, Image processing, Document processing, Table detection, Page object detection, Table classification.
https://www.sciencedirect.com/science/article/pii/S0925231221018142
This dataset supports detecting tables in scanned document images and classifying them by type, such as 'full lined' or 'no lines'. Models built on it take document images as input and output the detected tables, each labeled with its type. This is valuable for data entry specialists, archivists, and researchers who need to extract tabular data from large volumes of document scans.
No commits in the last 6 months.
Use this if you need to automatically locate tables within scanned document images and categorize them by their visual structure for further processing.
Not ideal if you are working with digitally native documents (e.g., PDFs with selectable text) where table data can be extracted directly without image processing.
Stars: 69
Forks: 4
Language: Python
License: MIT
Category:
Last pushed: Feb 24, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/abdoelsayed2016/TNCR_Dataset"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
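The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and parsing the response as JSON is an assumption, since this page does not document the response format.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot alter the path.
    owner_q = urllib.parse.quote(owner, safe="")
    repo_q = urllib.parse.quote(repo, safe="")
    return f"{BASE_URL}/{owner_q}/{repo_q}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns a JSON body. The page does not
    confirm this, so adjust the parsing if the format differs.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("abdoelsayed2016", "TNCR_Dataset"))
```

Calling `fetch_quality("abdoelsayed2016", "TNCR_Dataset")` performs the same unauthenticated request as the curl command, so it counts against the 100 requests/day limit.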
Higher-rated alternatives
Psarpei/Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and...
Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
Sudhanshu1304/table-transformer
🔍 Table Extraction Tool: A powerful open-source solution combining OCR and computer vision for...
ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
asagar60/TableNet-pytorch
Pytorch Implementation of TableNet