dzieciou/pystempel

Python port of Stempel, an algorithmic stemmer for the Polish language.

Emerging

When analyzing Polish text, you often need to reduce different inflected forms of a word (like "książka," "książki," "książkami") to a common base or "stem" ("książek"). This tool takes individual Polish words and outputs their stems, which lets you group related word forms for more accurate text analysis. It is aimed at anyone working with Polish language data, such as linguists, data scientists, or search engine developers.
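To make the grouping idea concrete, here is a minimal sketch of how word forms can be bucketed by stem. The `toy_stem` function and its suffix list are hypothetical stand-ins written for this illustration. They are not Stempel's algorithm or its stemming tables, and the stems they produce will differ from the library's output. `group_by_stem` accepts any callable that maps a word to a stem, so a real stemmer could presumably be passed in its place.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

# Hypothetical, deliberately naive suffix list for illustration only.
# This is NOT how Stempel works; Stempel uses learned stemming tables.
TOY_SUFFIXES = ("ami", "a", "i")


def toy_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the longest matching toy suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in sorted(TOY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def group_by_stem(words: Iterable[str], stem: Callable[[str], str]) -> Dict[str, List[str]]:
    """Group words that share the same stem, preserving input order within each group."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for word in words:
        groups[stem(word)].append(word)
    return dict(groups)


if __name__ == "__main__":
    forms = ["książka", "książki", "książkami", "dom"]
    for stem, members in group_by_stem(forms, toy_stem).items():
        print(stem, members)
```

Because the grouping step only depends on a word-to-stem callable, the same function works for search indexing or frequency counts regardless of which stemmer supplies the stems.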

No commits in the last 6 months.

Use this if you need to process Polish text and group related words together for tasks like search, information retrieval, or linguistic analysis.

Not ideal if you primarily work with languages other than Polish, or if you need to compile your own custom stemming tables from scratch.

Polish-language-processing text-analysis natural-language-processing information-retrieval linguistics

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 16 / 25

Community 11 / 25

Language

HTML

License

—

Higher-rated alternatives

hplt-project/sacremoses

Python port of Moses tokenizer, truecaser and normalizer

Blake-Madden/OleanderStemmingLibrary

Porter stemming library (C++)

adbar/simplemma

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

htaghizadeh/PersianStemmer-Python

PersianStemmer-Python

michmech/lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.
