felixdittrich92/docling-OCR-OnnxTR

OnnxTR OCR plugin for Docling

Emerging

This plugin adds OnnxTR-based optical character recognition (OCR) to Docling, converting PDFs and other document images into editable text. It extracts text from complex or scanned documents and outputs structured text that can be searched, copied, and integrated into other systems, which suits researchers, data entry specialists, or anyone handling large volumes of digital documents.

Available on PyPI.

Use this if you need to quickly and accurately extract text from scanned documents, images, or PDFs to make them searchable or editable.

Not ideal if you primarily need to process handwritten notes or highly stylized text, as its focus is on efficiency and standard document layouts.

document-processing data-extraction digital-archiving research-automation

Maintenance 10 / 25

Adoption 6 / 25

Maturity 25 / 25

Community 0 / 25

Language Python

License Apache-2.0

Higher-rated alternatives

JaidedAI/EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin,...

breezedeus/CnSTD

CnSTD: PyTorch/MXNet-based Chinese/English Scene Text Detection, Mathematical Formula...

githubharald/SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

felixdittrich92/OnnxTR

OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless,...

mindee/doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for...
