kotartemiy/topic-labeled-news-dataset
100k+ topic-labeled news articles published by thousands of news websites
This dataset provides over 100,000 news articles, each pre-categorized into one of eight common topics like Business, Entertainment, or Technology. It's a collection of news content published during the first half of August 2020 from thousands of sources, ready for use in research or analysis. Researchers, content strategists, or anyone analyzing news trends would find this valuable.
No commits in the last 6 months.
Use this if you need a large, pre-categorized collection of news articles to study news trends, build content classification models, or conduct media analysis.
Not ideal if you require current news data, real-time article streams, or topic coverage beyond the eight provided categories.
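As a rough illustration of the kind of trend analysis described above, the sketch below counts articles per topic using only the Python standard library. The column names (`topic`, `title`), the `;` delimiter, and the sample rows are assumptions made for illustration and are not confirmed by this listing; check the actual file before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. The column names and the ';' delimiter
# are assumptions, not taken from the dataset itself.
SAMPLE = """topic;title
BUSINESS;Markets rally as earnings beat forecasts
TECHNOLOGY;New phone launches this week
BUSINESS;Retail sales slow in July
"""


def read_rows(text, delimiter=";"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def topic_counts(rows, topic_field="topic"):
    """Count rows per topic label, normalizing case and whitespace."""
    return Counter(row[topic_field].strip().upper() for row in rows)


counts = topic_counts(read_rows(SAMPLE))
for topic, n in counts.most_common():
    print(topic, n)
```

To run this against the real data, the text passed to `read_rows` would come from the downloaded file instead of `SAMPLE`, with the delimiter and field name adjusted to match.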
Stars: 19
Forks: 2
Language: —
License: MIT
Category:
Last pushed: Aug 18, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kotartemiy/topic-labeled-news-dataset"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
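The same request can be made from Python. This is a minimal sketch using only the standard library; the helper names `build_url` and `fetch` are hypothetical, the response format is not documented on this page (so the body is returned as raw text), and no API key handling is shown because the key mechanism is not described here.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl command above. Whether path segments
# other than "nlp" exist is not stated on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_url(owner, repo):
    """Build the endpoint URL for a given owner/repo pair."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch(url, timeout=10):
    """Perform an unauthenticated GET and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch(build_url("kotartemiy", "topic-labeled-news-dataset"))` issues the same request as the curl command; it counts against the daily request limit mentioned above.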
Higher-rated alternatives
MIND-Lab/OCTIS
OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models...
i-dot-ai/themefinder
A topic modelling Python package for analysing one-to-many question-answer data.
andifunke/topic-labeling
The project proposes a framework to apply topic models on a text-corpus and eventually topic...
bab2min/tomotopy
Python package of Tomoto, the Topic Modeling Tool
bobxwu/TopMost
A Topic Modeling System Toolkit (ACL 2024 Demo)