duyvuleo/VNTC
A Large-scale Vietnamese News Text Classification Corpus
A large collection of Vietnamese news articles, each labeled by topic. It gives researchers, data scientists, and analysts working with Vietnamese text a ready-to-use dataset for training and evaluating systems that classify news articles automatically.
108 stars. No commits in the last 6 months.
Use this if you need a pre-categorized, large-scale dataset of Vietnamese news to develop or test automated news classification systems.
Not ideal if you need general Vietnamese text from outside the news domain, or a raw corpus without topic labels.
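To make "classifying news articles" concrete, here is a minimal sketch of a topic classifier (multinomial Naive Bayes with add-one smoothing) in plain Python. It is not the corpus authors' method. The four sample sentences and the labels "sports" and "business" are made up for illustration and are not taken from VNTC; loading the real corpus is left out because its file layout is not described here.

```python
import math
import unicodedata
from collections import Counter, defaultdict


def tokenize(text):
    # Normalize to NFC so visually identical Vietnamese characters compare equal,
    # then split on whitespace. Real pipelines often use a word segmenter instead.
    return unicodedata.normalize("NFC", text).lower().split()


def train(samples):
    """Count documents per label and tokens per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"labels": label_counts, "words": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, n_docs in model["labels"].items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(model["words"][label].values())
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore tokens never seen in training
            count = model["words"][label][tok]
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, NOT drawn from the VNTC corpus.
TOY_SAMPLES = [
    ("đội tuyển thắng trận bóng đá", "sports"),
    ("cầu thủ ghi bàn trong trận đấu", "sports"),
    ("ngân hàng tăng lãi suất", "business"),
    ("thị trường chứng khoán giảm điểm", "business"),
]
model = train(TOY_SAMPLES)
```

With the real corpus, `TOY_SAMPLES` would be replaced by (article text, topic) pairs read from the dataset, with part of the data held out for evaluation.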
Stars: 108
Forks: 58
Language: —
License: MIT
Category:
Last pushed: Sep 24, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/duyvuleo/VNTC"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
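The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, since the response schema is not described here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_quality_url(owner, repo):
    """Build the endpoint URL for a repository, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the record for owner/repo.

    Assumption: the endpoint returns a JSON body. No API key is sent,
    so the keyless daily request limit applies.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("duyvuleo", "VNTC")` issues the same GET request as the curl command. Error handling for rate-limit or network failures is omitted to keep the sketch short.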
Higher-rated alternatives
vunb/vntk
Vietnamese NLP Toolkit for Node
vncorenlp/VnCoreNLP
A Vietnamese natural language processing toolkit (NAACL 2018)
VinAIResearch/PhoNLP
PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity...
IBM/transition-amr-parser
SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments in Pytorch....
nert-nlp/AMR-Bibliography
Organized inventory of research using the Abstract Meaning Representation