proycon/analiticcl
An approximate string matching or fuzzy-matching system for spelling correction, normalisation, or post-OCR correction (mirror of https://codeberg.org/proycon/analiticcl)
This tool helps linguists, historians, or data entry specialists clean up messy text by finding and correcting misspelled words, transcription errors, or OCR mistakes. You provide a list of words or entire documents along with a reference dictionary, and it outputs suggested corrections and their likelihood scores. It's designed for anyone working with large volumes of text that might contain variations or errors.
Use this if you need to quickly identify and correct spelling inconsistencies or errors in historical documents, scanned texts (OCR), or user-generated content against a known vocabulary.
Not ideal if you're primarily looking for semantic similarity between words or phrases rather than spelling variants, or if you need to process highly structured data with precise matching requirements.
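The input/output shape described above (variant words in, ranked dictionary candidates with scores out) can be sketched in a few lines. This is only an illustration of lexicon-based fuzzy matching using plain Levenshtein distance; it is not analiticcl's algorithm or API. The names `edit_distance` and `suggest`, the `max_distance` and `limit` parameters, and the score formula are all hypothetical choices made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1]


def suggest(word, lexicon, max_distance=2, limit=3):
    """Return up to `limit` (entry, score) pairs, best first.

    The score is an assumed normalisation (1 - distance / longer length),
    not a calibrated likelihood.
    """
    candidates = []
    for entry in lexicon:
        d = edit_distance(word.lower(), entry.lower())
        if d <= max_distance:
            score = 1.0 - d / max(len(word), len(entry), 1)
            candidates.append((entry, round(score, 3)))
    # Highest score first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates[:limit]


# Hypothetical reference dictionary and misspelled inputs.
lexicon = ["separate", "desperate", "operate", "necessary"]
for variant in ["seperate", "neccesary", "xyzzy"]:
    print(variant, suggest(variant, lexicon))
```

Comparing every input against every lexicon entry, as this sketch does, scales poorly to large vocabularies; a dedicated tool is the better choice for the large text volumes mentioned above.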
Stars: 37
Forks: 4
Language: Rust
License: GPL-3.0
Category:
Last pushed: Feb 10, 2026
Monthly downloads: 70
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/proycon/analiticcl"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
forzagreen/n2words
Convert numerical numbers to written numbers, in 52+ languages.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and...