ticki/eudex
A blazingly fast phonetic reduction/hashing algorithm.
This project helps you quickly identify words that sound alike, even if spelled differently, across various European languages. You input a list of words, and it produces a phonetic hash for each, so that similar-sounding words receive similar hashes. This is useful for anyone working with large text datasets, such as data analysts, linguists, or search engineers, who needs to find or group similar-sounding terms.
221 stars and 4,387 monthly downloads. No commits in the last 6 months.
Use this if you need a very fast way to find potentially similar-sounding words in large text collections, especially for tasks like spell-checking, fuzzy searching, or deduplicating entries where exact matches aren't enough.
Not ideal if you need to calculate precise differences between words based on exact spelling or detailed linguistic analysis, as it focuses on broad phonetic similarity rather than exact distance.
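The deduplication use case above can be sketched in a few lines of Rust. To stay dependency-free, this sketch swaps in a simplified Soundex-style key instead of calling Eudex; `toy_phonetic_key` and `group_by_sound` are hypothetical names invented for illustration, not part of the eudex crate. Eudex is understood to yield hashes that are compared by distance rather than by equality, so treat this only as an outline of the grouping idea.

```rust
use std::collections::BTreeMap;

/// Toy phonetic key (an assumption for illustration, NOT the Eudex algorithm).
/// Keeps the first letter, maps later consonants to coarse sound classes,
/// skips vowels and h/w/y, and collapses adjacent repeats of the same class.
fn toy_phonetic_key(word: &str) -> String {
    let mut key = String::new();
    let mut last_class: Option<char> = None;
    for (i, c) in word.chars().filter(|c| c.is_ascii_alphabetic()).enumerate() {
        let c = c.to_ascii_lowercase();
        let class = match c {
            'b' | 'f' | 'p' | 'v' => Some('1'),
            'c' | 'g' | 'j' | 'k' | 'q' | 's' | 'x' | 'z' => Some('2'),
            'd' | 't' => Some('3'),
            'l' => Some('4'),
            'm' | 'n' => Some('5'),
            'r' => Some('6'),
            _ => None, // vowels and h, w, y carry no class
        };
        if i == 0 {
            // The first letter is kept verbatim (lowercased).
            key.push(c);
        } else if let Some(digit) = class {
            // Emit a class digit only when it differs from the previous letter's class.
            if last_class != Some(digit) {
                key.push(digit);
            }
        }
        last_class = class;
    }
    key
}

/// Groups words that share the same toy key; each group is a set of
/// candidate duplicates that a later, stricter check could confirm.
fn group_by_sound<'a>(words: &[&'a str]) -> BTreeMap<String, Vec<&'a str>> {
    let mut groups: BTreeMap<String, Vec<&'a str>> = BTreeMap::new();
    for &word in words {
        groups.entry(toy_phonetic_key(word)).or_default().push(word);
    }
    groups
}
```

Calling `group_by_sound(&["Smith", "Smyth", "Jones"])` places "Smith" and "Smyth" in one group and "Jones" in another. Exact-key grouping is coarser than a distance-based comparison, which is one reason a locality-sensitive hash suits fuzzy search better.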
Stars: 221
Forks: 11
Language: Rust
License: MIT
Category:
Last pushed: Aug 16, 2021
Monthly downloads: 4,387
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ticki/eudex"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
forzagreen/n2words
Convert numerical numbers to written numbers, in 52+ languages.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and...