Layout-Parser/layout-parser
A Unified Toolkit for Deep Learning Based Document Image Analysis
This toolkit helps you automatically understand and extract information from scanned documents and images. You provide document images or PDFs, and it outputs structured information about the layout, like where text, titles, images, or tables are located on the page. It's ideal for data analysts, researchers, and operations professionals who deal with large volumes of documents.
5,678 stars. No commits in the last 6 months.
Use this if you need to automate categorizing, extracting, or searching information in scanned documents, turning unstructured image data into an organized format.
Not ideal if you only need basic text extraction from simple documents without complex layouts, or if you require a ready-to-use application rather than a programming toolkit.
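To make "structured information about the layout" concrete, here is a minimal sketch of the kind of output described above and one way to query it. The `Region` type, the `regions_of_kind` helper, and the coordinates are all hypothetical and are not the toolkit's own API; they only illustrate typed bounding boxes that can be filtered and put into reading order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One detected page region (hypothetical type, not the toolkit's class)."""
    kind: str   # e.g. "Title", "Text", "Table", "Figure"
    x1: float   # left edge, in pixels
    y1: float   # top edge, in pixels
    x2: float   # right edge, in pixels
    y2: float   # bottom edge, in pixels


def regions_of_kind(regions, kind):
    """Return regions of one kind, ordered top to bottom, then left to right."""
    matches = [r for r in regions if r.kind == kind]
    return sorted(matches, key=lambda r: (r.y1, r.x1))


# Made-up coordinates standing in for a detector's output on a single page.
page = [
    Region("Text", 50, 300, 550, 700),
    Region("Title", 50, 40, 550, 90),
    Region("Table", 50, 720, 550, 900),
    Region("Text", 50, 110, 550, 280),
]

text_blocks = regions_of_kind(page, "Text")
```

With regions in this form, a later step could crop each box out of the page image and pass it to OCR or a table extractor.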
Stars: 5,678
Forks: 525
Language: Python
License: Apache-2.0
Category:
Last pushed: Aug 15, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Layout-Parser/layout-parser"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
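The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the reading of the URL path as category/owner/repo is inferred from the single example above, the response is assumed to be JSON (the page does not say), and `quality_url` and `fetch_quality` are illustrative names rather than part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("ml-frameworks", "Layout-Parser", "layout-parser")` issues the same GET request as the curl command and counts against the daily request limit.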
Higher-rated alternatives
Psarpei/Multi-Type-TD-TSR
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and...
Sudhanshu1304/table-transformer
🔍 Table Extraction Tool: A powerful open-source solution combining OCR and computer vision for...
asagar60/TableNet-pytorch
Pytorch Implementation of TableNet
ses4255/Versatile-OCR-Program
Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)
JG1VPP/MuTabNet
ICDAR 2024 Table OCR Model