sorenlind/lemmy
🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪
This tool helps researchers, linguists, and data analysts working with Danish or Swedish text reduce words to their base form. You input a word, optionally with its part-of-speech tag, and it outputs the lemma (e.g., the Danish 'akvariernes' becomes 'akvarie'). This is useful for tasks like text analysis, information retrieval, and building linguistic resources.
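The word-plus-optional-tag interface described above can be pictured with a toy lookup. This is only an illustrative sketch: the table `TOY_LEMMAS` and the function `lemmatize` are hypothetical names made up for this example, and they are not Lemmy's API or its algorithm. The point is to show why a part-of-speech tag helps: the Danish form 'skaber' can be a noun ('skaber') or a present-tense verb ('skabe').

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy table mapping (POS tag, word form) to a lemma.
# A real lemmatizer covers far more words; this is illustration only.
TOY_LEMMAS: Dict[Tuple[str, str], str] = {
    ("NOUN", "akvariernes"): "akvarie",
    ("NOUN", "skaber"): "skaber",
    ("VERB", "skaber"): "skabe",
}


def lemmatize(word: str, pos: Optional[str] = None) -> List[str]:
    """Return candidate lemmas for `word`, narrowed by `pos` when given."""
    word = word.lower()
    candidates = {
        lemma
        for (entry_pos, entry_word), lemma in TOY_LEMMAS.items()
        if entry_word == word and (pos is None or entry_pos == pos)
    }
    # Words missing from the table fall back to the input form itself.
    return sorted(candidates) if candidates else [word]


if __name__ == "__main__":
    print(lemmatize("akvariernes"))          # one candidate
    print(lemmatize("skaber"))               # ambiguous without a tag
    print(lemmatize("skaber", pos="VERB"))   # the tag resolves the ambiguity
```

Without a tag the toy function returns every candidate it knows; with a tag it returns only the matching reading.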
No commits in the last 6 months. Available on PyPI.
Use this if you need to quickly and accurately find the base form of words in Danish or Swedish texts, especially when performing linguistic analysis or preparing data for further processing.
Not ideal if you are working with languages other than Danish or Swedish, or if you need a full natural language processing pipeline that goes beyond just lemmatization.
Stars: 79
Forks: 9
Language: Python
License: MIT
Category
Last pushed: Sep 20, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/sorenlind/lemmy"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
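For scripted access, the curl command above could be mirrored in Python roughly as follows. This is a minimal sketch using only the standard library. The function names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path pattern is inferred from the single URL shown, and the assumption that the endpoint returns JSON is unverified. How an API key is passed is not stated here, so the sketch makes keyless requests only.

```python
import json
import urllib.request

# Base path taken from the curl command; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository (keyless request)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    # Assumption: the response body is JSON.
    return json.loads(body)


if __name__ == "__main__":
    print(quality_url("nlp", "sorenlind", "lemmy"))
```

Calling `fetch_quality("nlp", "sorenlind", "lemmy")` performs the network request and counts against the daily request limit.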
Higher-rated alternatives
hplt-project/sacremoses
Python port of Moses tokenizer, truecaser and normalizer
Blake-Madden/OleanderStemmingLibrary
Porter stemming library (C++)
adbar/simplemma
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
htaghizadeh/PersianStemmer-Python
PersianStemmer-Python
michmech/lemmatization-lists
Machine-readable lists of lemma-token pairs in 23 languages.