wongnai/wongnai-corpus
Collection of Wongnai's datasets
This collection offers Thai-language datasets, primarily for natural language processing research. It includes search query words (some labeled algorithmically, some labeled by humans), a food dictionary, and restaurant reviews with star ratings. The datasets help researchers and data scientists build and evaluate models for tasks such as word segmentation and review rating prediction.
No commits in the last 6 months.
Use this if you are developing or researching natural language processing models specifically for the Thai language, especially for tasks related to search query understanding or sentiment analysis of reviews.
Not ideal if your project does not involve the Thai language or if you need general-purpose text data unrelated to food, restaurants, or search queries.
Stars: 79
Forks: 23
Language: —
License: LGPL-3.0
Category:
Last pushed: Aug 26, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/wongnai/wongnai-corpus"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
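The same request can be made from a script. The following is a minimal sketch in Python using only the standard library. It assumes the path segments after `/quality/` are a category, a repository owner, and a repository name (inferred from the curl URL above), and that the endpoint returns JSON. The response schema is not documented here, so the helper names `build_quality_url` and `fetch_quality` are hypothetical and the result is returned unparsed beyond JSON decoding.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: the three path segments are category/owner/repo,
    inferred from the single example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (hypothetical helper).

    Assumption: the body is JSON. No fields are accessed here because
    the response schema is not described on this page.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the request.
    print(build_quality_url("nlp", "wongnai", "wongnai-corpus"))
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than calling `fetch_quality` in a loop.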
Higher-rated alternatives
malaysia-ai/malaya
Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/
IndoNLP/indonlu
The first-ever vast natural language processing benchmark for Indonesian Language. We provide...
louisowen6/NLP_bahasa_resources
A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia
kirralabs/indonesian-NLP-resources
Data resources for Indonesian NLP
rizalespe/Dataset-Sentimen-Analisis-Bahasa-Indonesia
This repository is a collection of datasets related to Indonesian-language sentiment analysis. If...