several27/FakeNewsCorpus

A dataset of millions of news articles scraped from a curated list of data sources.

50 / 100 (Established)

This dataset contains millions of news articles from a range of sources, categorized by type (such as fake news, satire, or credible). Each record includes the raw content, title, authors, and other metadata, so it can be fed into automated analysis pipelines. It is aimed at data scientists, researchers, and anyone building content verification tools.

413 stars. No commits in the last 6 months.

Use this if you need a large, pre-labeled corpus of news articles to train machine learning models for identifying different types of news, particularly for 'fake news' detection.

Not ideal if you need a constantly updated news dataset for real-time analysis; continuous updates are not planned for this dataset.
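
As a rough illustration of a first step with a corpus of this size, the sketch below streams a CSV file row by row and tallies articles per label, so the file never has to fit in memory. The CSV format, the path passed to `print_label_counts`, and the column name `type` are assumptions made for illustration; check the dataset's own documentation for the actual file layout and schema.

```python
import csv
import sys
from collections import Counter
from typing import Iterable, Iterator, TextIO


def read_articles(handle: TextIO) -> Iterator[dict]:
    """Yield one dict per CSV row, reading lazily from an open text file."""
    # Article bodies can exceed the csv module's default field size limit,
    # so raise it to the largest value that is safe on all platforms.
    csv.field_size_limit(min(sys.maxsize, 2**31 - 1))
    yield from csv.DictReader(handle)


def count_labels(rows: Iterable[dict], label_column: str = "type") -> Counter:
    """Tally rows per label, skipping rows whose label is missing or blank.

    `label_column` defaults to "type", which is an assumed column name.
    """
    counts: Counter = Counter()
    for row in rows:
        label = (row.get(label_column) or "").strip()
        if label:
            counts[label] += 1
    return counts


def print_label_counts(path: str) -> None:
    """Print labels from most to least frequent for the CSV at `path`."""
    with open(path, newline="", encoding="utf-8") as handle:
        for label, n in count_labels(read_articles(handle)).most_common():
            print(f"{label}: {n}")
```

Calling `print_label_counts("news_sample.csv")` (a hypothetical file name) would print one line per label. Checking the class balance this way before training helps decide whether resampling or class weighting is needed.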

news-analysis media-monitoring content-verification fact-checking misinformation-research
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25


Stars: 413
Forks: 98
Language: (not listed)
License: Apache-2.0
Last pushed: Jan 25, 2020
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/several27/FakeNewsCorpus"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
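
The same request can be made from Python using only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, splitting the URL into a base plus owner and repository is an assumption drawn from the single `curl` example above, and the response is assumed to be JSON, which the listing does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; treating the last two path
# segments as "owner/repo" is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is UTF-8 JSON."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `fetch_quality("several27", "FakeNewsCorpus")` requests the same URL as the `curl` command. With no key the limit is 100 requests per day, so cache the result rather than refetching it on every run.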