mshka/farsi_processor
Farsi processor is a Ruby gem to process (stem and normalize) Persian/Farsi text
This gem helps Ruby developers prepare Farsi (Persian) text for analysis or search by standardizing characters and removing grammatical endings. It takes raw Farsi text containing varied character forms and suffixes and returns a cleaned, normalized version.
No commits in the last 6 months.
Use this if you are a Ruby developer building an application that needs to normalize and stem Farsi text for tasks like search, text analysis, or natural language processing.
Not ideal if you work outside Ruby or if your text processing needs go beyond basic normalization and stemming (e.g., sentiment analysis or translation).
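To make "normalize and stem" concrete, here is a minimal plain-Ruby sketch of the two steps described above. It does not use the farsi_processor gem and does not reflect its actual API or rules. The method names (normalize, stem, process), the character map, and the two-entry suffix list are all hypothetical and chosen only for illustration.

```ruby
# Illustrative only: NOT the farsi_processor gem's API or rule set.

# Arabic code points commonly folded into their Persian equivalents.
CHAR_MAP = {
  "\u064A" => "\u06CC", # Arabic yeh -> Persian yeh
  "\u0643" => "\u06A9"  # Arabic kaf -> Persian kaf
}.freeze

# A tiny, hypothetical suffix list, longest first.
SUFFIXES = [
  "\u0647\u0627\u06CC", # plural marker followed by yeh
  "\u0647\u0627"        # plural marker
].freeze

ZWNJ = "\u200C" # zero-width non-joiner, often written before a suffix

# Replace each mapped Arabic character with its Persian form.
def normalize(text)
  text.gsub(/[\u064A\u0643]/, CHAR_MAP)
end

# Strip the first matching suffix, keeping at least two characters of stem.
def stem(word)
  SUFFIXES.each do |suffix|
    next unless word.end_with?(suffix)
    next if word.length - suffix.length < 2

    return word[0...-suffix.length].delete_suffix(ZWNJ)
  end
  word
end

# Normalize the text, then stem each whitespace-separated word.
def process(text)
  normalize(text).split.map { |word| stem(word) }.join(" ")
end
```

A real stemmer needs a much larger suffix inventory and guards against over-stripping; this sketch only shows the shape of the pipeline (character folding first, suffix removal second).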
Stars: 8
Forks: 1
Language: Ruby
License: MIT
Category:
Last pushed: Jan 01, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/mshka/farsi_processor"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
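The same request can be made from Ruby with the standard library. This is a sketch only: the helper names quality_uri and fetch_quality are hypothetical, and because the response format is not documented here, the body is returned as a raw string rather than parsed.

```ruby
require "net/http"
require "uri"

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"

# Build the endpoint URI for a given owner/repo pair.
def quality_uri(owner, repo)
  URI("#{BASE_URL}/#{owner}/#{repo}")
end

# Fetch the raw response body; raise on any non-2xx status.
def fetch_quality(owner, repo)
  response = Net::HTTP.get_response(quality_uri(owner, repo))
  unless response.is_a?(Net::HTTPSuccess)
    raise "Request failed: HTTP #{response.code}"
  end

  response.body
end

# Usage (performs a network request):
#   puts fetch_quality("mshka", "farsi_processor")
```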
Higher-rated alternatives
hplt-project/sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
michmech/lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.