KadenMc/PreprocessingHTR

Pre-processing a handwritten page into word images for Handwritten Text Recognition (HTR).

34 / 100 (Emerging)

This tool helps researchers, historians, and archivists convert scanned images of handwritten pages into individual word images, making them ready for Handwritten Text Recognition (HTR) systems. You input a full, clear image of a handwritten page, and it outputs separate images for each word found on the page. It's designed for anyone working with historical documents or large collections of handwritten text who needs to digitize content.
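The page-to-word step described above can be illustrated with a classic projection-profile segmentation. This is a minimal sketch of the general idea, not the repository's actual method or API: the function names `runs` and `segment_words`, the `word_gap` parameter, and the toy binary grid (1 = ink, 0 = background) are all assumptions made for illustration.

```python
from typing import List, Tuple

Grid = List[List[int]]  # hypothetical page representation: 1 = ink, 0 = background


def runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return end-exclusive (start, end) spans of non-empty profile entries.

    Spans separated by fewer than min_gap empty entries are merged.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))

    merged: List[Tuple[int, int]] = []
    for span in spans:
        if merged and span[0] - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], span[1])
        else:
            merged.append(span)
    return merged


def segment_words(page: Grid, word_gap: int = 2) -> List[Tuple[int, int, int, int]]:
    """Return word boxes as (top, bottom, left, right), all end-exclusive."""
    boxes = []
    # Rows that contain ink form text lines.
    row_profile = [sum(row) for row in page]
    for top, bottom in runs(row_profile):
        line = page[top:bottom]
        # Within a line, columns that contain ink form words; gaps narrower
        # than word_gap are treated as spacing between letters.
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile, min_gap=word_gap):
            boxes.append((top, bottom, left, right))
    return boxes
```

A sketch like this only works when text lines are separated by clean blank rows, which is one way to see why warped pages and overlapping lines are hard cases for this kind of preprocessing.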

No commits in the last 6 months.

Use this if you need to prepare scanned handwritten documents for automated text recognition, specifically by extracting individual word images.

Not ideal if your handwritten pages are heavily warped, have overlapping text lines, or were scanned with uneven lighting or unclear page borders, as it relies on clear page structure.

historical-document-analysis digitization archival-processing data-extraction optical-character-recognition
Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 11 / 25

Stars: 31
Forks: 4
Language: Python
License: MIT
Last pushed: Dec 16, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/KadenMc/PreprocessingHTR"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
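The same request can be made from Python using only the standard library. This is a rough sketch under stated assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the response body is returned as raw text because its format is not documented here.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the three-part path layout is an assumption."""
    path = "/".join(quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + path


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the quality record and return the raw response body as text."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network request against the keyless tier.
    print(fetch_quality("ml-frameworks", "KadenMc", "PreprocessingHTR"))
```

Keeping URL construction separate from the network call makes the path logic easy to check offline and keeps the request count under the daily limit during testing.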