proycon/analiticcl

An approximate string matching (fuzzy matching) system for spelling correction, normalisation, or post-OCR correction (mirror of https://codeberg.org/proycon/analiticcl)

47 / 100 (Emerging)

This tool helps linguists, historians, or data entry specialists clean up messy text by finding and correcting misspelled words, transcription errors, or OCR mistakes. You provide a list of words or entire documents along with a reference dictionary, and it outputs suggested corrections and their likelihood scores. It's designed for anyone working with large volumes of text that might contain variations or errors.

Use this if you need to quickly identify and correct spelling inconsistencies or errors in historical documents, scanned texts (OCR), or user-generated content against a known vocabulary.

Not ideal if you're primarily looking for semantic similarity between words or phrases rather than spelling variants, or if you're working with highly structured data that requires exact matching.
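The workflow described above (misspelled words plus a reference dictionary in, ranked corrections with scores out) can be illustrated with a minimal sketch. This is not analiticcl's own algorithm or API: it uses Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, and the `suggest` function, its parameters, and the sample lexicon are hypothetical names chosen for illustration only.

```python
from difflib import SequenceMatcher


def suggest(word, lexicon, max_candidates=3, min_score=0.6):
    """Return (candidate, score) pairs from the lexicon, best match first.

    Illustrative stand-in only: scores are difflib similarity ratios in
    [0, 1], not the likelihood scores that analiticcl itself reports.
    """
    scored = []
    for entry in lexicon:
        score = SequenceMatcher(None, word.lower(), entry.lower()).ratio()
        if score >= min_score:
            scored.append((entry, round(score, 3)))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:max_candidates]


# Hypothetical reference dictionary and noisy input words.
lexicon = ["separate", "receive", "necessary", "occurrence"]
for word in ["seperate", "recieve", "xyz"]:
    print(word, suggest(word, lexicon))
```

A word with no sufficiently similar lexicon entry (here `xyz`) yields an empty list, which mirrors the idea that a fuzzy matcher should decline to correct input it cannot place in the known vocabulary.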

text-normalization spelling-correction OCR-post-correction digital-humanities data-cleaning
No Package, No Dependents
Maintenance 10 / 25
Adoption 11 / 25
Maturity 16 / 25
Community 10 / 25


Stars 37
Forks 4
Language Rust
License GPL-3.0
Last pushed Feb 10, 2026
Monthly downloads 70
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/proycon/analiticcl"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
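The curl call above can also be issued from a script. The following is a minimal sketch using only the Python standard library. The URL layout (category, then `owner/repo`) is inferred from the single example shown here, and the helper names `quality_url` and `fetch_quality` are hypothetical. The response format is not documented on this page, so the sketch returns the raw body as text and assumes UTF-8 encoding.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL for a repository given as 'owner/name'."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the raw response body as text (UTF-8 is an assumption)."""
    with urlopen(quality_url(category, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network access.
print(quality_url("nlp", "proycon/analiticcl"))
```

Calling `fetch_quality("nlp", "proycon/analiticcl")` performs the actual request and counts against the daily request limit stated above.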