pooya-mohammadi/persian-spell-checker-kenlm

Complete instructions for training a Persian spell checker and a language model, based on SymSpell and KenLM respectively, using a Wikipedia dataset.

26 / 100 (Experimental)

This project helps you build a custom spell checker and a language model for Persian. You feed it a large amount of Persian text, such as a Wikipedia dump, and it produces a dictionary for spell checking and a language model that estimates how likely a given sequence of Persian words is. This is useful for anyone working with Persian text data, such as content creators, linguists, or data analysts who need to improve text quality or analyze language patterns.
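The two outputs described above can be illustrated with a minimal, standard-library sketch. It is not the repository's code: the function names, the tokenizer regex, and the two-sentence corpus are all assumptions made for illustration. The dictionary lines use the plain `term count` layout that SymSpell frequency dictionaries commonly use, and the bigram estimate is an unsmoothed maximum-likelihood ratio, which is far simpler than the smoothed n-gram models KenLM estimates.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of Arabic-script letters, the extra Persian letters,
# and the zero-width non-joiner (U+200C) that appears inside Persian words.
TOKEN_RE = re.compile(r"[\u0621-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC\u200C]+")


def tokenize(line):
    """Split a line into word tokens, dropping stray zero-width non-joiners."""
    tokens = (t.strip("\u200c") for t in TOKEN_RE.findall(line))
    return [t for t in tokens if t]


def count_unigrams_and_bigrams(lines):
    """Count single words and adjacent word pairs, one sentence per line."""
    unigrams, bigrams = Counter(), Counter()
    for line in lines:
        tokens = tokenize(line)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def dictionary_lines(unigrams):
    """Format counts as 'term count' lines, most frequent first."""
    ordered = sorted(unigrams.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{term} {count}" for term, count in ordered]


def bigram_probability(unigrams, bigrams, first, second):
    """Unsmoothed estimate of P(second | first) = count(first, second) / count(first)."""
    if unigrams[first] == 0:
        return 0.0
    return bigrams[(first, second)] / unigrams[first]


# Hypothetical two-sentence corpus standing in for a Wikipedia dump.
corpus = ["من کتاب می\u200cخوانم", "من کتاب دارم"]
unigrams, bigrams = count_unigrams_and_bigrams(corpus)

for entry in dictionary_lines(unigrams):
    print(entry)
print(bigram_probability(unigrams, bigrams, "کتاب", "دارم"))
```

On a real corpus, the dictionary lines would be written to a file for the spell checker to load, and the language model would be trained with KenLM's own tooling rather than with raw counts like these.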

No commits in the last 6 months.

Use this if you need to build a specialized spell checker or a language model for the Persian language from scratch, using a large text corpus.

Not ideal if you need a spell checker for a language other than Persian, or if you require an out-of-the-box solution without training a new model.

Persian-language text-analysis spell-checking natural-language-processing text-normalization
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 3 / 25

Stars: 35
Forks: 1
Language: Python
License: MIT
Last pushed: Jul 20, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/pooya-mohammadi/persian-spell-checker-kenlm"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
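
The same request can be made from Python. The sketch below is an assumption-laden illustration rather than an official client: the helper names `quality_url` and `fetch_quality` are hypothetical, and because the response format is not described here, the body is returned as raw text decoded as UTF-8 (also an assumption).

```python
import urllib.request
from urllib.parse import quote

# Endpoint prefix taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner, repo):
    """Build the quality endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body; the payload format is not assumed here."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = quality_url("pooya-mohammadi", "persian-spell-checker-kenlm")
print(url)
```

Calling `fetch_quality("pooya-mohammadi", "persian-spell-checker-kenlm")` performs the actual network request and counts against the daily request limit.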