kmkurn/id-nlp-resource

A list of Indonesian NLP resources.

38 / 100 (Emerging)

This is a curated list of publicly available language data for Indonesian, including collections of news articles, social media posts, and transcribed speech. It serves as a central index for anyone who needs Indonesian text or audio to train or evaluate language models, analyze sentiment, or build translation systems. It is aimed at researchers, data scientists, and language technology developers working with Indonesian.

290 stars. No commits in the last 6 months.

Use this if you need pre-existing Indonesian text or speech datasets for developing or evaluating language-related applications and research.

Not ideal if you need a tool that processes Indonesian text or speech directly; this resource only provides the raw data.

natural-language-processing machine-translation speech-recognition sentiment-analysis corpus-linguistics
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 20 / 25
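
The four category scores above appear to add up to the overall score. The page does not state the formula, so the plain sum used here is an assumption inferred from the numbers shown, not a documented scoring method. A minimal sketch of that arithmetic:

```python
# Category scores copied from the listing above (each out of 25).
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 8,
    "Community": 20,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
maximum = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {maximum}")  # prints "38 / 100"
```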


Stars 290
Forks 48
Language (not listed)
License (none)
Last pushed Jan 18, 2022
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/kmkurn/id-nlp-resource"
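
The same request can be made from a script. The sketch below uses only the Python standard library. It assumes the URL pattern is /api/v1/quality/<category>/<owner>/<repo>, which is inferred from the single curl example above, and that the endpoint returns JSON. Neither is confirmed by this page, and the response fields are not documented here, so the code returns the parsed payload as-is. The names build_quality_url and fetch_quality are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the segment layout after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, percent-encoding each path segment."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and parse it as JSON (assumed response format)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("nlp", "kmkurn", "id-nlp-resource") requests the same URL as the curl command. Keeping URL construction separate from the network call makes the URL easy to check without spending any of the daily request quota.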

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.