htaghizadeh/JPersianStemmer
Persian stemmer
This tool helps you analyze Persian text by reducing words to their stems: it strips affixes such as plural markers and verb conjugation endings. You input Persian words and it returns their base forms, which makes it easier to count distinct terms or compare texts. It's useful for linguists, researchers, or anyone working with large volumes of Persian-language data.
No commits in the last 6 months.
Use this if you need to process Persian text for analysis, search, or information retrieval and require words to be standardized to their root form.
Not ideal if you are working with languages other than Persian or if you need to preserve the full grammatical form of words.
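To make the idea of stemming concrete, here is a minimal Java sketch of naive suffix stripping. It is an illustration only and does not reproduce JPersianStemmer's algorithm or API: the class name ToySuffixStripper, the three-suffix list, and the minimum stem length are all assumptions chosen for the example.

```java
import java.util.List;

// Toy illustration only: NOT the JPersianStemmer algorithm or API.
class ToySuffixStripper {
    // Zero-width non-joiner (U+200C), often written between a Persian stem and its suffix.
    private static final String ZWNJ = "\u200C";

    // A few plural-type suffixes, longest first so the longer form is tried before its prefix.
    private static final List<String> SUFFIXES = List.of(
            "\u0647\u0627\u06CC", // های
            "\u0647\u0627",       // ها
            "\u0627\u0646");      // ان

    // Hypothetical guard: shorter stems are left untouched to limit over-stemming.
    private static final int MIN_STEM_LENGTH = 3;

    static String strip(String word) {
        for (String suffix : SUFFIXES) {
            if (word.endsWith(suffix)) {
                String stem = word.substring(0, word.length() - suffix.length());
                // Drop a trailing ZWNJ left behind after removing the suffix.
                if (stem.endsWith(ZWNJ)) {
                    stem = stem.substring(0, stem.length() - 1);
                }
                if (stem.length() >= MIN_STEM_LENGTH) {
                    return stem;
                }
            }
        }
        return word; // no rule applied
    }

    public static void main(String[] args) {
        List<String> words = List.of(
                "\u06A9\u062A\u0627\u0628\u0647\u0627",        // کتابها (books)
                "\u06A9\u062A\u0627\u0628\u200C\u0647\u0627",  // same word written with a ZWNJ
                "\u062F\u0631\u062E\u062A\u0627\u0646");       // درختان (trees)
        for (String w : words) {
            System.out.println(w + " -> " + strip(w));
        }
    }
}
```

A rule list this small over-stems any word that merely happens to end in one of the suffixes, which is one reason real stemmers rely on larger rule sets and exception handling.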
Stars: 15
Forks: 4
Language: Java
License: GPL-3.0
Category
Last pushed: Jun 18, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/htaghizadeh/JPersianStemmer"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
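Since the project itself is written in Java, the same request can be sketched with the JDK's built-in HTTP client (Java 11 or later). The class and method names below are hypothetical, and the path layout of category, owner, and repository is an assumption inferred from the single URL above. The response format is not documented here, so the body is returned as a raw string.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper; only the base URL comes from the curl example.
class QualityApiClient {
    static final String BASE = "https://pt-edge.onrender.com/api/v1/quality";

    // Assumption: the path is /quality/{category}/{owner}/{repo}.
    static HttpRequest buildRequest(String category, String owner, String repo) {
        URI uri = URI.create(BASE + "/" + category + "/" + owner + "/" + repo);
        return HttpRequest.newBuilder(uri).GET().build();
    }

    // Sends the request and returns the raw response body without parsing it.
    static String fetch(HttpRequest request) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IllegalStateException("Unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetch(buildRequest("nlp", "htaghizadeh", "JPersianStemmer")));
    }
}
```

Keeping request construction separate from sending makes the URL easy to check without touching the network, which helps when working within the daily request limit.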
Higher-rated alternatives
hplt-project/sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
michmech/lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.