Layout-Parser/layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis

/ 100

Emerging

This toolkit helps you automatically understand and extract information from scanned documents and images. You provide document images or PDFs, and it outputs structured information about the layout, like where text, titles, images, or tables are located on the page. It's ideal for data analysts, researchers, and operations professionals who deal with large volumes of documents.
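To make the "structured information about the layout" concrete, here is a minimal, self-contained sketch of what such output could look like and how it might be consumed. It does not call the toolkit: the `LayoutBlock` class, its field names, the helper functions, and the sample detections are all hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LayoutBlock:
    """One detected region: a label plus a pixel bounding box (hypothetical schema)."""
    label: str    # e.g. "Title", "Text", "Table", "Figure"
    x1: float     # left edge
    y1: float     # top edge
    x2: float     # right edge
    y2: float     # bottom edge
    score: float  # detector confidence in [0, 1]

    @property
    def area(self) -> float:
        # Clamp to zero so a degenerate box never yields a negative area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def filter_blocks(blocks: List[LayoutBlock], label: str,
                  min_score: float = 0.5) -> List[LayoutBlock]:
    """Keep blocks with the given label whose confidence is at least min_score."""
    return [b for b in blocks if b.label == label and b.score >= min_score]


def reading_order(blocks: List[LayoutBlock]) -> List[LayoutBlock]:
    """Naive single-column reading order: top to bottom, then left to right."""
    return sorted(blocks, key=lambda b: (b.y1, b.x1))


# Made-up detections for one page (assumed values, not real model output).
page = [
    LayoutBlock("Text", 50, 300, 550, 480, 0.97),
    LayoutBlock("Title", 50, 40, 550, 90, 0.99),
    LayoutBlock("Table", 50, 500, 550, 760, 0.91),
    LayoutBlock("Text", 50, 110, 550, 280, 0.95),
    LayoutBlock("Figure", 400, 120, 540, 260, 0.32),
]

text_blocks = reading_order(filter_blocks(page, "Text"))
ordered_labels = [b.label for b in reading_order(page)]

for block in text_blocks:
    print(block.label, (block.x1, block.y1, block.x2, block.y2), block.score)
```

Sorting by the top edge is only adequate for single-column pages; multi-column layouts would need column grouping before ordering.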

5,678 stars. No commits in the last 6 months.

Use this if you need to automate the process of categorizing, extracting, or searching information within scanned documents, converting unstructured image data into an organized format.

Not ideal if you only need basic text extraction from simple documents without complex layouts, or if you require a ready-to-use application rather than a programming toolkit.

document-analysis information-extraction data-entry-automation digital-archives research-data-capture

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 20 / 25


Stars 5,678

Forks 525

Language Python

License Apache-2.0

Higher-rated alternatives

Psarpei/Multi-Type-TD-TSR

Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and...

Sudhanshu1304/table-transformer

🔍 Table Extraction Tool: A powerful open-source solution combining OCR and computer vision for...

asagar60/TableNet-pytorch

Pytorch Implementation of TableNet

ses4255/Versatile-OCR-Program

Multi-modal OCR pipeline optimized for ML training (text, figure, math, tables, diagrams)

JG1VPP/MuTabNet

ICDAR 2024 Table OCR Model
