LeonieWeissweiler/CISTEM
Stemmer for German
CISTEM normalizes German words by reducing them to their stem, much as an English stemmer reduces 'walking' to 'walk'. It takes German words as input and returns their stems, which makes it easier to group inflected forms of the same word. This is useful for linguists, researchers, and anyone running text analysis on German content.
No commits in the last 6 months.
Use this if you need to reduce German words to common stems for tasks like search, data analysis, or linguistic studies.
Not ideal if you need a tool for languages other than German, or if you require full morphological analysis beyond just stemming.
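To make the idea of stemming concrete, here is a minimal Python sketch of suffix stripping and of grouping words by their stem. It is a toy illustration only: the suffix list, the `toy_stem` and `group_by_stem` names, and the minimum-length rule are assumptions made for this example and are not CISTEM's actual rules or API.

```python
from collections import defaultdict

# Hypothetical, simplified suffix list for illustration only.
# CISTEM's real rule set is different and more careful.
SUFFIXES = ("en", "er", "em", "es", "e", "n", "s", "t")


def toy_stem(word: str, min_len: int = 3) -> str:
    """Lowercase the word and strip at most one suffix,
    keeping at least `min_len` characters in the result."""
    stem = word.lower()
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= min_len:
            return stem[: -len(suffix)]
    return stem


def group_by_stem(words):
    """Map each stem to the list of input words that reduce to it."""
    groups = defaultdict(list)
    for word in words:
        groups[toy_stem(word)].append(word)
    return dict(groups)


print(group_by_stem(["Kind", "Kinder", "Kindes", "laufen", "laufe"]))
# {'kind': ['Kind', 'Kinder', 'Kindes'], 'lauf': ['laufen', 'laufe']}
```

Grouping by stem is what makes a search for one inflected form match the others. A real stemmer also has to handle details this sketch ignores, such as umlauts and prefixes.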
Stars: 45
Forks: 10
Language: C
License: MIT
Category:
Last pushed: Apr 29, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/LeonieWeissweiler/CISTEM"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
hplt-project/sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
michmech/lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.