MuhammadYaseenKhan/Urdu-Sentiment-Corpus
Labelled Dataset for Urdu Sentiment Analysis
This project provides a collection of Urdu texts, primarily tweets, each manually labeled as expressing positive, negative, or neutral sentiment. It helps researchers and linguists who are building or evaluating systems that automatically detect the emotional tone of Urdu-language content. The input is raw Urdu text, and the output is the same text with an associated sentiment label.
No commits in the last 6 months.
Use this if you are developing or testing algorithms that automatically detect sentiment in written Urdu content, especially social media posts.
Not ideal if you need a dataset for tasks other than sentiment analysis, such as topic modeling or language translation.
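As a rough illustration of the "text plus sentiment label" shape described above, the sketch below parses a tab-separated sample into (text, label) pairs and counts the labels. The two-column layout, the column names `text` and `label`, the label codes P / N / O, and the sample sentences are all assumptions made for illustration; check the repository files for the actual format before relying on any of them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in an ASSUMED two-column TSV layout (text, label).
# The column names and the P / N / O codes are illustrative assumptions,
# not a confirmed description of the corpus files.
SAMPLE_TSV = (
    "text\tlabel\n"
    "یہ فلم بہت اچھی تھی\tP\n"
    "سروس بہت خراب ہے\tN\n"
    "آج پیر ہے\tO\n"
)

# Assumed mapping from short label codes to readable sentiment names.
LABEL_NAMES = {"P": "positive", "N": "negative", "O": "neutral"}


def load_labeled_texts(handle):
    """Return a list of (text, sentiment) pairs, skipping malformed rows."""
    # QUOTE_NONE keeps quotation marks inside tweets as literal characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        text = (row.get("text") or "").strip()
        code = (row.get("label") or "").strip()
        if not text or code not in LABEL_NAMES:
            continue  # drop empty texts and unknown label codes
        records.append((text, LABEL_NAMES[code]))
    return records


records = load_labeled_texts(io.StringIO(SAMPLE_TSV))
label_counts = Counter(label for _, label in records)
```

To read a real file instead of the in-memory sample, pass an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points at whichever data file the repository provides.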
Stars: 9
Forks: 19
Language: not listed
License: not listed
Category: not listed
Last pushed: Dec 06, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/MuhammadYaseenKhan/Urdu-Sentiment-Corpus"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
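The same request can be made from Python with the standard library alone. This is a minimal sketch: the `category/owner/repo` path pattern is inferred from the single URL shown above, and the assumption that the endpoint returns JSON is not confirmed by this page. The function names `build_quality_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo pattern (inferred from one example URL).
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is UTF-8 encoded JSON; adjust the decoding
    if the API turns out to return something else.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("nlp", "MuhammadYaseenKhan", "Urdu-Sentiment-Corpus")
```

Calling `fetch_quality("nlp", "MuhammadYaseenKhan", "Urdu-Sentiment-Corpus")` performs the network request. Given the 100 requests/day limit without a key, caching the result locally is worth considering.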
Higher-rated alternatives
acl-org/acl-anthology
Data and software for building the ACL Anthology.
anoopkunchukuttan/indic_nlp_library
Resources and tools for Indian language Natural Language Processing
CLUEbenchmark/CLUECorpus2020
Large-scale pre-training corpus for Chinese (100G)
KennethEnevoldsen/scandinavian-embedding-benchmark
A Scandinavian Benchmark for sentence embeddings
Separius/awesome-sentence-embedding
A curated list of pretrained sentence and word embedding models