KadenMc/PreprocessingHTR

Pre-processing a handwritten page into word images for Handwritten Text Recognition (HTR).

Emerging

This tool helps researchers, historians, and archivists convert scanned images of handwritten pages into individual word images, making them ready for Handwritten Text Recognition (HTR) systems. You input a full, clear image of a handwritten page, and it outputs separate images for each word found on the page. It's designed for anyone working with historical documents or large collections of handwritten text who needs to digitize content.
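To make the page-in, word-images-out idea concrete, here is a minimal sketch of one classic way to do it: projection-profile segmentation. It is not taken from this repository and may differ from what the tool actually does. It assumes the page has already been binarized into a grid of 0s (background) and 1s (ink), and the names `runs`, `segment_words`, `crop`, and `letter_gap` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]          # assumed format: binary page, 1 = ink, 0 = background
Box = Tuple[int, int, int, int]  # (top, bottom, left, right), end-exclusive


def runs(profile: List[int], max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return (start, end) spans of non-zero entries in `profile`,
    merging spans separated by at most `max_gap` zero entries."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends just before i
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for span in spans:
        if merged and span[0] - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], span[1])  # small gap: same word
        else:
            merged.append(span)
    return merged


def segment_words(page: Image, letter_gap: int = 1) -> List[Box]:
    """Split a binary page into word boxes.

    Rows containing ink are grouped into text lines; within each line,
    columns containing ink are grouped into words. Gaps of at most
    `letter_gap` empty columns are treated as spacing between letters.
    """
    boxes: List[Box] = []
    row_ink = [sum(row) for row in page]
    for top, bottom in runs(row_ink):
        line = page[top:bottom]
        col_ink = [sum(col) for col in zip(*line)]
        for left, right in runs(col_ink, max_gap=letter_gap):
            boxes.append((top, bottom, left, right))
    return boxes


def crop(page: Image, box: Box) -> Image:
    """Cut one word image out of the page."""
    top, bottom, left, right = box
    return [row[left:right] for row in page[top:bottom]]


# Toy page: two words on the first text line, one word on the second.
page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
]
word_images = [crop(page, box) for box in segment_words(page)]
```

A sketch like this only works when lines are separated by fully blank rows and words by clear gaps, which is why approaches of this kind depend on flat, evenly lit pages with non-overlapping text lines.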

No commits in the last 6 months.

Use this if you need to prepare scanned handwritten documents for automated text recognition, specifically by extracting individual word images.

Not ideal if your handwritten pages are heavily warped, have overlapping text lines, or were scanned with uneven lighting or unclear page borders, as it relies on clear page structure.

historical-document-analysis digitization archival-processing data-extraction optical-character-recognition

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 11 / 25

Language: Python

License: MIT

Higher-rated alternatives

JaidedAI/EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin,...

breezedeus/CnSTD

CnSTD: PyTorch/MXNet-based Chinese/English Scene Text Detection, Mathematical Formula...

githubharald/SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

felixdittrich92/OnnxTR

OnnxTR a docTR (Document Text Recognition) library Onnx pipeline wrapper - for seamless,...

mindee/doctr

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for...
