bhattbhavesh91/DocTR-OCR-tutorial

This repository contains a notebook that demonstrates the power of the Document Text Recognition (DocTR) library.

Emerging

This project helps you extract text from scanned documents, images, and PDFs. It takes an image or PDF file as input and outputs the text content found within it. This is useful for anyone who needs to convert physical documents or images of text into editable or searchable digital text, such as data entry clerks, researchers, or archivists.
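The input-to-text flow described above can be sketched roughly as follows. This is a minimal sketch, not the notebook's own code: it assumes the `python-doctr` package is installed, and the helper names `load_document`, `export_to_lines`, and `extract_text` are hypothetical, introduced only for illustration. It relies on docTR's `DocumentFile` loaders, `ocr_predictor`, and the nested dictionary returned by `result.export()` (pages, then blocks, then lines, then words).

```python
from pathlib import Path


def load_document(path):
    """Load a PDF or an image into the page list docTR expects (hypothetical helper)."""
    # Imported lazily so export_to_lines can be used without docTR installed.
    from doctr.io import DocumentFile

    if Path(path).suffix.lower() == ".pdf":
        return DocumentFile.from_pdf(path)
    return DocumentFile.from_images(path)


def export_to_lines(export):
    """Flatten an exported result (pages > blocks > lines > words) into text lines."""
    lines = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [word["value"] for word in line.get("words", [])]
                if words:  # skip lines in which no words were recognized
                    lines.append(" ".join(words))
    return lines


def extract_text(path):
    """Run a pretrained docTR pipeline on one file and return its text lines."""
    from doctr.models import ocr_predictor

    model = ocr_predictor(pretrained=True)  # fetches pretrained weights on first use
    result = model(load_document(path))
    return export_to_lines(result.export())
```

Calling `extract_text("scanned_invoice.pdf")` (a placeholder file name, not a file from the repository) would return the recognized text one line per list entry. Keeping `export_to_lines` separate from the model call makes it possible to post-process a saved export without rerunning OCR.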

No commits in the last 6 months.

Use this if you need to quickly and accurately digitize text from images or non-searchable PDF documents.

Not ideal if you primarily work with already searchable digital text documents or require highly specialized handwriting recognition.

document-digitization data-extraction information-capture content-conversion records-management

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 15 / 25

Language

Jupyter Notebook

License

Apache-2.0

Higher-rated alternatives

JaidedAI/EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin,...

breezedeus/CnSTD

CnSTD: PyTorch/MXNet-based Chinese/English Scene Text Detection, Mathematical Formula...

githubharald/SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

felixdittrich92/OnnxTR

OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless,...

mindee/doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for...
