WZBSocialScienceCenter/germalemma
A lemmatizer for German language text
GermaLemma helps social scientists, linguists, and other researchers analyze German text by reducing inflected words to their base form, or lemma. You provide each German word together with its part-of-speech tag, and it returns the standardized base form of that word. This is useful for tasks such as text analysis, natural language processing, and preparing data for language models.
No commits in the last 6 months. Available on PyPI.
Use this if you need to accurately convert inflected German words (nouns, verbs, adjectives, adverbs) into their base dictionary form for linguistic analysis or data processing.
Not ideal if your project requires lemmatizing parts of speech beyond nouns, verbs, adjectives, and adverbs, or if you need an actively maintained tool.
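The input and output contract described above (a word plus its part-of-speech tag goes in, a lemma comes out, and only nouns, verbs, adjectives, and adverbs are handled) can be illustrated with a small toy sketch. This is not GermaLemma's implementation or API: the lookup table, the tag names `N`, `V`, `ADJ`, `ADV`, and the function `toy_find_lemma` are all hypothetical and exist only to show the shape of the task.

```python
from typing import Dict, Tuple

# Hypothetical tag set mirroring the four supported word classes
# (nouns, verbs, adjectives, adverbs). The tag names are an assumption.
SUPPORTED_POS = {"N", "V", "ADJ", "ADV"}

# Hypothetical miniature lexicon: (lowercased word form, POS tag) -> lemma.
# A real lemmatizer relies on a far larger lexicon plus fallback rules.
TOY_LEMMAS: Dict[Tuple[str, str], str] = {
    ("häuser", "N"): "Haus",
    ("ging", "V"): "gehen",
    ("schnelle", "ADJ"): "schnell",
}


def toy_find_lemma(word: str, pos: str) -> str:
    """Return the lemma for (word, pos), or the word itself if unknown."""
    if pos not in SUPPORTED_POS:
        # Word classes outside the supported set are rejected outright.
        raise ValueError(f"unsupported part-of-speech tag: {pos!r}")
    return TOY_LEMMAS.get((word.lower(), pos), word)


for token, tag in [("Häuser", "N"), ("ging", "V"), ("schnelle", "ADJ")]:
    print(token, tag, "->", toy_find_lemma(token, tag))
```

The part-of-speech tag is part of the lookup key because the same surface form can map to different lemmas depending on its word class, which is why the tool asks for tagged input rather than bare words.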
Stars: 94
Forks: 11
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 07, 2023
Commits (30d): 0
Dependencies: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/WZBSocialScienceCenter/germalemma"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
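The same request can be sketched in Python using only the standard library. The URL layout is taken from the curl command above; everything else is an assumption, including the helper names `build_quality_url` and `fetch_quality`, the idea that `nlp` is a category path segment, and the guess that the endpoint returns JSON.

```python
import json
import urllib.request

# Base path copied from the curl example; the rest of the layout is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "nlp") -> str:
    """Assemble the endpoint URL (hypothetical helper, not an official client)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, category: str = "nlp",
                  timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("WZBSocialScienceCenter", "germalemma")
# print(data)
```

With the keyless limit of 100 requests per day, it is probably worth caching the decoded result locally instead of refetching it on every run.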
Higher-rated alternatives
hplt-project/sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
michmech/lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.