several27/FakeNewsCorpus

A dataset of millions of news articles scraped from a curated list of data sources.

/ 100

Established

This dataset offers millions of news articles from a range of sources, categorized by type (for example fake news, satire, or credible). Each record provides the raw content, title, authors, and other metadata, so the corpus can be fed directly into an automated analysis pipeline. It is aimed at data scientists, researchers, and anyone building content-verification tools.
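As a rough illustration of feeding such records into an analysis step, the sketch below streams a CSV export row by row, counts articles per category, and selects examples for chosen categories. It is a minimal sketch under stated assumptions, not the project's documented loader: the column names (`type`, `title`, `authors`, `content`), the label strings (`fake`, `satire`, `reliable`), and the sample rows are hypothetical stand-ins, so check the actual file header before relying on them.

```python
import csv
import io
from collections import Counter

# Article bodies can be long; raise the csv module's per-field size limit.
csv.field_size_limit(10**9)


def count_by_type(lines, type_column="type"):
    """Count rows per category label, reading one row at a time.

    `lines` is any iterable of text lines, such as an open file.
    `type_column` is an assumed column name, not confirmed by the source.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        label = (row.get(type_column) or "").strip() or "unlabeled"
        counts[label] += 1
    return counts


def iter_examples(lines, wanted_types, type_column="type"):
    """Yield (title, content, label) for rows whose label is in wanted_types."""
    for row in csv.DictReader(lines):
        label = (row.get(type_column) or "").strip()
        if label in wanted_types:
            yield row.get("title") or "", row.get("content") or "", label


# Made-up sample rows standing in for the real file (hypothetical data).
SAMPLE = """type,title,authors,content
fake,Example headline A,Jane Doe,Body text A
satire,Example headline B,,Body text B
reliable,Example headline C,John Roe,Body text C
fake,Example headline D,,Body text D
"""

if __name__ == "__main__":
    print(count_by_type(io.StringIO(SAMPLE)))
    for title, _content, label in iter_examples(io.StringIO(SAMPLE), {"fake"}):
        print(label, title)
```

With a real file, the same functions could be given an open file handle, for example `open(path, newline="", encoding="utf-8")`. Streaming rows this way avoids loading millions of articles into memory at once.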

413 stars. No commits in the last 6 months.

Use this if you need a large, pre-labeled corpus of news articles to train machine learning models for identifying different types of news, particularly for 'fake news' detection.

Not ideal if you need a constantly updated news dataset for real-time analysis: there are no plans to update this dataset continuously.

news-analysis media-monitoring content-verification fact-checking misinformation-research

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 24 / 25

Stars

413

Forks

Language

—

License

Apache-2.0

Related tools

openfactcheck-research/openfactcheck

An Open-source Factuality Evaluation Demo for LLMs

lilakk/BooookScore

A package to generate summaries of long-form text and evaluate the coherence of these summaries....

Cartus/Automated-Fact-Checking-Resources

Links to conference/journal publications in automated fact-checking (resources for the...

armingh2000/FactScoreLite

FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy...

manideep2510/siamese-BERT-fake-news-detection-LIAR

Triple Branch BERT Siamese Network for fake news classification on LIAR-PLUS dataset in PyTorch
