Brand24-AI/mms_benchmark

The most extensive open, massively multilingual corpus of datasets for training sentiment models. The corpus consists of 79 datasets, manually selected from over 350 reported in the scientific literature on the basis of strict quality criteria, and covers 27 languages.

Experimental

This project provides a curated collection of sentiment analysis datasets spanning 27 languages and multiple domains, such as product reviews, so you can train or fine-tune models that recognize sentiment in text. Each example pairs raw text with a sentiment category (positive, neutral, or negative), with coverage across different cultural contexts. It is aimed at data scientists, machine learning engineers, and researchers building global applications that need to interpret customer feedback or social media mentions in different languages.
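As a rough illustration of working with three-class sentiment data of this kind, the sketch below filters a handful of in-memory rows by language and counts the examples per class. The field names `text`, `label`, and `language`, the sample rows, and both helper functions are hypothetical and chosen only for this example; they are not the corpus's confirmed schema or API.

```python
from collections import Counter

# The three sentiment classes named in the description above.
LABELS = ("negative", "neutral", "positive")

# Hypothetical rows: field names and contents are assumptions for
# illustration only, not taken from the actual corpus.
rows = [
    {"text": "Great phone, fast delivery.", "label": "positive", "language": "en"},
    {"text": "Der Akku ist nach einer Woche kaputt.", "label": "negative", "language": "de"},
    {"text": "Paczka dotarła we wtorek.", "label": "neutral", "language": "pl"},
    {"text": "Terrible support.", "label": "negative", "language": "en"},
]


def select_language(rows, language):
    """Return only the rows whose language code matches."""
    return [row for row in rows if row["language"] == language]


def label_distribution(rows):
    """Count examples per sentiment class, keeping classes with zero rows."""
    counts = Counter(row["label"] for row in rows)
    return {label: counts.get(label, 0) for label in LABELS}


english = select_language(rows, "en")
english_counts = label_distribution(english)
print(english_counts)
```

Checking the per-language class balance this way before training helps reveal languages where one class is missing or heavily under-represented.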

No commits in the last 6 months.

Use this if you need high-quality, pre-curated, multilingual sentiment datasets to build or improve your AI models, especially for nuanced, culture-dependent language tasks.

Not ideal if you're looking for a ready-to-use, out-of-the-box sentiment analysis tool rather than a dataset for model training.

sentiment-analysis multilingual-text natural-language-processing social-listening customer-feedback-analysis

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 0 / 25

Language

Jupyter Notebook

License

—

Higher-rated alternatives

RISE-UNIBAS/humanities_data_benchmark

LLM Benchmark Suite for Humanities Data

ma-compbio/DNALONGBENCH

A benchmark suite of five genomics tasks for evaluating DNA foundation models on long-range dependencies.

wgyhhhh/EASE

Official repository for "Towards Real-Time Fake News Detection under Evidence Scarcity"

TreeAI-Lab/NumericBench

A comprehensive benchmark to evaluate and improve the fundamental numerical reasoning abilities...
