IndoNLP/nusax
High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)
This project provides high-quality text data for understanding opinions expressed in 10 Indonesian local languages, alongside Indonesian and English. It offers expertly translated sentiment datasets and parallel lexicons, enabling analysis of how people feel or what they think in these diverse languages. Market researchers, linguists, or social scientists focused on Indonesian regional contexts would find this valuable for sentiment analysis.
110 stars. No commits in the last 6 months.
Use this if you need to analyze sentiment or translate text involving Indonesian local languages like Javanese, Sundanese, or Balinese.
Not ideal if you need sentiment analysis or translation for languages other than these 10 Indonesian local languages, Indonesian, and English.
Stars: 110
Forks: 10
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: May 08, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/IndoNLP/nusax"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
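The same request can be made from a script. Below is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_quality` are hypothetical. The `category/owner/repo` path pattern is inferred from the single example URL above, and the response format is not documented here, so the code assumes JSON and falls back to the raw text if parsing fails.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the path pattern is <category>/<owner>/<repo>,
    inferred from the one example URL on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository (hypothetical helper).

    Assumption: the body is JSON. If it is not, the raw text is returned
    so the caller can inspect it.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality("nlp", "IndoNLP", "nusax")` issues the same GET request as the curl command. Keyless use is limited to 100 requests per day, so cache the result if you poll repeatedly.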
Higher-rated alternatives
malaysia-ai/malaya
Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/
IndoNLP/indonlu
The first-ever vast natural language processing benchmark for Indonesian Language. We provide...
louisowen6/NLP_bahasa_resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
kirralabs/indonesian-NLP-resources
Data resources for Indonesian-language NLP
wongnai/wongnai-corpus
Collection of Wongnai's datasets