FreeOCR-AI/layoutreader

A faster LayoutReader model based on LayoutLMv3 that sorts OCR bounding boxes into reading order.

Quality score: 42 / 100 (Emerging)

This tool determines the correct reading order of content blocks in scanned documents, PDFs, or images, so that text extracted by OCR can be reassembled coherently. It takes the bounding boxes of text sections as input and outputs the numerical sequence in which they should be read. It is aimed at anyone who processes large volumes of unstructured document images, such as data entry specialists, archivists, or legal professionals.
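To make the input and output shapes concrete, here is a minimal sketch of the task itself: bounding boxes in, a reading-order rank per box out. It deliberately uses a simple geometric heuristic (group boxes into lines by their top edge, then read left to right), not the LayoutLMv3-based model this project provides. The function name `naive_reading_order`, the `(left, top, right, bottom)` box convention, and the `line_tolerance` parameter are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box convention for this sketch: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def naive_reading_order(boxes: List[Box], line_tolerance: int = 10) -> List[int]:
    """Return, for each input box, its rank in a top-to-bottom, left-to-right order.

    This is a geometric baseline for illustration only. It is not the
    LayoutLMv3-based model, and it will fail on multi-column layouts.
    """
    # Visit boxes from the top of the page downward (ties broken by left edge).
    by_top = sorted(range(len(boxes)), key=lambda i: (boxes[i][1], boxes[i][0]))

    # Group boxes whose top edges lie within `line_tolerance` of the first
    # box in the current line; anything further down starts a new line.
    lines: List[List[int]] = []
    for i in by_top:
        if lines and abs(boxes[i][1] - boxes[lines[-1][0]][1]) <= line_tolerance:
            lines[-1].append(i)
        else:
            lines.append([i])

    # Within each line, read left to right.
    sequence = [i for line in lines for i in sorted(line, key=lambda j: boxes[j][0])]

    # Convert the visiting sequence into a rank per original box.
    order = [0] * len(boxes)
    for rank, index in enumerate(sequence):
        order[index] = rank
    return order


example_boxes = [
    (300, 52, 500, 80),   # right half of the first line
    (50, 50, 250, 80),    # left half of the first line
    (50, 120, 500, 150),  # second line
]
example_order = naive_reading_order(example_boxes)
```

A heuristic like this breaks down on multi-column pages, sidebars, and tables, which is the kind of layout where a learned model is expected to do better.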

314 stars. No commits in the last 6 months.

Use this if you need to reliably convert visual document layouts into sequential, readable text, especially from diverse or complex document types.

Not ideal if your primary need is simple OCR without complex layout understanding, or if you only process documents with very basic, uniform layouts.

document-processing data-extraction digital-archiving information-capture content-digitization
Stale 6m · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 14 / 25

Stars: 314
Forks: 26
Language: Python
License:
Last pushed: Aug 15, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/FreeOCR-AI/layoutreader"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
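The same request can be made from Python with the standard library. This is a sketch under two assumptions that the listing does not confirm: that the endpoint returns a JSON body, and that the path follows a `category/owner/repo` pattern generalized from the single URL shown above. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path copied from the curl example; treating the rest of the path as
# category/owner/repo is an assumption based on that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the quality data, assuming the response is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("transformers", "FreeOCR-AI", "layoutreader")
```

Because the unauthenticated tier is limited to 100 requests/day, it may be worth caching the result locally rather than calling `fetch_quality` repeatedly.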