kirralabs/indonesian-NLP-resources
Data resources for Indonesian-language NLP
This is a collection of Indonesian language text and word data. It provides various datasets, including sentences from news articles and web content, as well as detailed word lists categorized by type (like root words, verbs, nouns, slang, and positive/negative sentiment words). Language researchers, computational linguists, and data scientists working with Indonesian text will find this useful for training language models or analyzing text data.
230 stars. No commits in the last 6 months.
Use this if you need comprehensive datasets for developing or improving natural language processing applications specifically for the Indonesian language.
Not ideal if you are looking for ready-to-use NLP models or tools rather than raw linguistic data.
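As a rough illustration of how positive/negative sentiment word lists like the ones described above might be used, here is a minimal lexicon-scoring sketch in Python. The word sets are small hand-picked stand-ins rather than the repository's actual contents, and `lexicon_score` is a hypothetical helper name; in practice the sets would be loaded from the repository's word-list files.

```python
import re

# Illustrative stand-ins only (assumption): a real run would load these sets
# from the repository's positive and negative word-list files.
POSITIVE = {"bagus", "senang", "suka"}
NEGATIVE = {"buruk", "kecewa", "benci"}


def lexicon_score(text, positive=POSITIVE, negative=NEGATIVE):
    """Return (number of positive tokens) minus (number of negative tokens)."""
    # Naive tokenizer: lowercase the text and keep runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    pos_hits = sum(token in positive for token in tokens)
    neg_hits = sum(token in negative for token in tokens)
    return pos_hits - neg_hits


if __name__ == "__main__":
    print(lexicon_score("Filmnya bagus, saya senang"))
    print(lexicon_score("Pelayanan buruk dan saya kecewa"))
```

A score above zero suggests positive wording and a score below zero suggests negative wording. This ignores negation, affixes, and slang normalization, which is where the root-word and slang lists would likely come in.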
Stars: 230
Forks: 49
Language: not specified
License: MIT
Category: not specified
Last pushed: Sep 19, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kirralabs/indonesian-NLP-resources"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
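The same request can be made from Python using only the standard library. This is a sketch under assumptions: the endpoint is taken from the curl command above, reading its path segments as category, owner, and repository is a guess, and the response is assumed (not confirmed) to be JSON. `quality_url` and `fetch_quality` are hypothetical helper names.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Keyless access is limited to 100 requests/day, so avoid calling in a loop.
    print(fetch_quality("nlp", "kirralabs", "indonesian-NLP-resources"))
```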
Higher-rated alternatives
malaysia-ai/malaya
Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/
IndoNLP/indonlu
The first-ever vast natural language processing benchmark for Indonesian Language. We provide...
louisowen6/NLP_bahasa_resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
wongnai/wongnai-corpus
Collection of Wongnai's datasets
rizalespe/Dataset-Sentimen-Analisis-Bahasa-Indonesia
This repository is a collection of datasets related to Indonesian-language sentiment analysis. If...