proycon/analiticcl

An approximate string matching (fuzzy-matching) system for spelling correction, normalisation, or post-OCR correction (mirror of https://codeberg.org/proycon/analiticcl)

Emerging

This tool helps linguists, historians, or data entry specialists clean up messy text by finding and correcting misspelled words, transcription errors, or OCR mistakes. You provide a list of words or entire documents along with a reference dictionary, and it outputs suggested corrections and their likelihood scores. It's designed for anyone working with large volumes of text that might contain variations or errors.

Use this if you need to quickly identify and correct spelling inconsistencies or errors in historical documents, scanned texts (OCR), or user-generated content against a known vocabulary.

Not ideal if you're primarily looking for semantic similarity between words or phrases rather than spelling variants, or if you need to process highly structured data with precise matching requirements.
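
To make the input and output described above concrete, here is a minimal Python sketch of lexicon-based correction: a word and a reference vocabulary go in, ranked suggestions with scores come out. This is not analiticcl's algorithm or API. It swaps in plain Levenshtein distance as a stand-in, and the names `edit_distance`, `suggest`, and `max_distance` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def suggest(word: str, lexicon: List[str],
            max_distance: int = 2) -> List[Tuple[str, float]]:
    """Return (candidate, score) pairs from the lexicon, best first.

    The score is a simple normalised similarity in [0, 1]; it is an
    illustrative assumption, not the scoring used by any real tool.
    """
    results = []
    for entry in lexicon:
        d = edit_distance(word.lower(), entry.lower())
        if d <= max_distance:
            score = 1.0 - d / max(len(word), len(entry), 1)
            results.append((entry, score))
    # Highest score first; ties broken alphabetically for stable output.
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results


if __name__ == "__main__":
    vocabulary = ["separate", "desperate", "operate"]
    for candidate, score in suggest("seperate", vocabulary):
        print(f"{candidate}\t{score:.3f}")
```

A brute-force scan like this compares the input against every lexicon entry, so it only suits small vocabularies; it is meant to show the shape of the task, not how to do it at scale.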

text-normalization spelling-correction OCR-post-correction digital-humanities data-cleaning

No Package · No Dependents

Maintenance 10 / 25

Adoption 11 / 25

Maturity 16 / 25

Community 10 / 25

Language: Rust

License: GPL-3.0

Higher-rated alternatives

PyThaiNLP/nlpo3

Thai natural language processing library in Rust, with Python and Node bindings.

forzagreen/n2words

Convert numerical numbers to written numbers, in 52+ languages.

greyblake/whatlang-rs

Natural language detection library for Rust. Try demo online: https://whatlang.org/

wikimedia/sentencex

A sentence segmentation library with wide language support optimized for speed and utility.

pemistahl/lingua-rs

The most accurate natural language detection library for Rust, suitable for short text and...
