igorbrigadir/stopwords
Default English stopword lists from many different sources
When analyzing text, certain common words (like 'the' or 'and') don't add much meaning. This project provides various lists of these 'stop words' from different sources like search engines, databases, and text analysis tools. It helps researchers, content strategists, or anyone working with text data to clean their input by identifying and removing these words, leading to more focused and relevant analysis.
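The cleaning step described above can be sketched in a few lines of Python. This is only an illustration: the tiny `STOP_WORDS` set and the `remove_stop_words` helper are hypothetical names made up for this example, not part of the project, and a real run would substitute one of the project's lists.

```python
import re

# Tiny illustrative stop-word set (an assumption for this sketch);
# in practice this would be replaced by one of the project's lists.
STOP_WORDS = {"the", "and", "a", "of", "to", "is", "in"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in stop_words]


print(remove_stop_words("The cat and the dog sat in the garden"))
# prints ['cat', 'dog', 'sat', 'garden']
```

The tokenizer here is deliberately simple (lowercase ASCII letters and apostrophes); a search or topic-modeling pipeline would normally use its own tokenizer and only borrow the stop-word set.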
313 stars. No commits in the last 6 months.
Use this if you need to clean English text data by filtering out common, uninformative words for tasks like search indexing, topic modeling, or sentiment analysis, and want to compare or use different established stop word lists.
Not ideal if you need stop word lists for languages other than English or if you require highly specialized, domain-specific stop words that aren't typically covered by general lists.
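Comparing two stop word lists, as mentioned above, comes down to set operations. The sketch below assumes each list is available as plain text with one word per line; that layout, the helper names `load_stop_words` and `compare_lists`, and the two sample lists are all assumptions for illustration rather than anything the project prescribes.

```python
def load_stop_words(lines):
    """Build a set from an iterable of lines, one word per line (assumed format)."""
    return {line.strip().lower() for line in lines if line.strip()}


def compare_lists(first, second):
    """Summarize the overlap between two stop-word sets."""
    union = first | second
    return {
        "shared": first & second,
        "only_first": first - second,
        "only_second": second - first,
        # Jaccard similarity: shared words divided by all distinct words.
        "jaccard": len(first & second) / len(union) if union else 1.0,
    }


# Hypothetical sample data standing in for two real lists.
list_a = load_stop_words(["the", "and", "of", "a"])
list_b = load_stop_words(["The", "and", "very", "a", "really", ""])

report = compare_lists(list_a, list_b)
print(sorted(report["shared"]))       # prints ['a', 'and', 'the']
print(sorted(report["only_second"]))  # prints ['really', 'very']
print(report["jaccard"])              # prints 0.5
```

Because `load_stop_words` accepts any iterable of lines, an open file handle can be passed in directly in place of the in-memory lists.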
Stars: 313
Forks: 125
Language: Python
License: not specified
Category:
Last pushed: Apr 06, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/igorbrigadir/stopwords"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
Alir3z4/python-stop-words
Get list of common stop words in various languages in Python
hklemp/dotnet-stop-words
Get list of common stop words in various languages in dotnet
skupriienko/Ukrainian-Stopwords
A list of ~2000 Ukrainian stopwords (with numbers)
stdlib-js/datasets-savoy-stopwords-fr
A list of French stop words.
eklem/stopword-trainer
A module for creating stopword lists for any language, based on a set of documents.