vikasing/news-stopwords
A huge list of stopwords collected from millions of news articles
This repository provides lists of frequently occurring terms compiled from millions of news articles. News analysts, researchers, and content strategists can use them to identify and filter out common, low-information words and phrases in large news collections, so that analysis can focus on the distinctive content.
No commits in the last 6 months.
Use this if you need to clean and prepare large news datasets for text analysis by removing highly common but semantically uninformative words.
Not ideal if your analysis requires retaining all words, including common ones, or if you are working with text from domains other than news.
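As an illustration of the filtering step described above, here is a minimal Python sketch. It assumes the stopwords are available as plain text with one entry per line; the listing does not state the repository's file format, so that is an assumption. The function names `load_stopwords` and `remove_stopwords` and the sample entries are hypothetical and do not come from the repository.

```python
import re


def load_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines, skipping blanks."""
    return {line.strip().lower() for line in lines if line.strip()}


def remove_stopwords(text, stopwords):
    """Return the lowercased tokens of `text` that are not in `stopwords`."""
    tokens = re.findall(r"[A-Za-z0-9']+", text.lower())
    return [tok for tok in tokens if tok not in stopwords]


# Hypothetical sample entries standing in for a real stopword file
# (assumed format: one stopword per line).
sample_lines = ["the", "said", "on", "a", "", "Reuters"]
stopwords = load_stopwords(sample_lines)

headline = "The minister said on Tuesday that a new tax was approved"
filtered = remove_stopwords(headline, stopwords)
print(filtered)
```

With a real file, the same loader could be fed an open file handle, for example `load_stopwords(open(path, encoding="utf-8"))`, where `path` points to whichever list you downloaded.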
Stars: 14
Forks: 5
Language: —
License: —
Category
Last pushed: Jun 21, 2017
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/vikasing/news-stopwords"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
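The same request can be made from Python with the standard library. This is only a sketch: the path pattern (category, owner, repository) is inferred from the single URL shown above, the helper names `build_url` and `fetch` are hypothetical, and the response format is not documented here, so the body is returned as raw text.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Assemble the endpoint URL; the path pattern is inferred from one example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE + "/" + "/".join(parts)


def fetch(url, timeout=10):
    """Return the response body as text; no response schema is assumed."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_url("nlp", "vikasing", "news-stopwords")
# body = fetch(url)  # uncomment to perform the network request
```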
Higher-rated alternatives
Alir3z4/python-stop-words
Get list of common stop words in various languages in Python
hklemp/dotnet-stop-words
Get list of common stop words in various languages in dotnet
igorbrigadir/stopwords
Default English stopword lists from many different sources
skupriienko/Ukrainian-Stopwords
The list of ~2000 Ukrainian stopwords (with numbers)
stdlib-js/datasets-savoy-stopwords-fr
A list of French stop words.