Vietnamese NLP Tools

Comprehensive NLP resources, toolkits, and datasets specifically for Vietnamese language processing tasks. Includes Vietnamese-specific tools, corpora, and task-specific models. Does NOT include general multilingual NLP tools, language-agnostic frameworks, or non-Vietnamese language resources.

There are 53 Vietnamese NLP tools tracked. 4 score 50 or above (Established tier). The highest-rated is vunb/vntk at 58/100 with 218 stars.

Get all 53 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=vietnamese-nlp-tools&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
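The curl request above can be sketched in Python with only the standard library. The endpoint and query parameters are copied from the curl example; the function names `build_url` and `fetch_projects` are hypothetical, and the response schema is not described on this page, so the sketch returns the decoded JSON without interpreting it. Whether a `limit` larger than 20 is accepted is an assumption that should be checked against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({
        "domain": "nlp",
        "subcategory": "vietnamese-nlp-tools",
        "limit": limit,  # the curl example uses 20; other values are an assumption
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Send the keyless GET request and return the decoded JSON body as-is."""
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs one request, which counts against the keyless daily quota mentioned above.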

# Tool Score Tier
1 vunb/vntk

Vietnamese NLP Toolkit for Node

58
Established
2 vncorenlp/VnCoreNLP

A Vietnamese natural language processing toolkit (NAACL 2018)

51
Established
3 VinAIResearch/PhoNLP

PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging,...

51
Established
4 IBM/transition-amr-parser

SoTA Abstract Meaning Representation (AMR) parsing with word-node alignments...

50
Established
5 nert-nlp/AMR-Bibliography

Organized inventory of research using the Abstract Meaning Representation

47
Emerging
6 sheng-z/stog

AMR Parsing as Sequence-to-Graph Transduction

47
Emerging
7 duyvuleo/VNTC

A Large-scale Vietnamese News Text Classification Corpus

47
Emerging
8 anhthuan1999/Vietnamese-News-Classification

We use LSTM, BiLSTM, BERT and SVM with TF-IDF, Word2vec and Bag-of-words to...

41
Emerging
9 undertheseanlp/NLP-Vietnamese-progress

Repository to track the progress in Vietnamese Natural Language Processing,...

41
Emerging
10 henryle97/Spelling_Correction_Vietnamese

Vietnamese spelling error correction with Seq2Seq model

40
Emerging
11 Nguyendat-bit/VieTokenizer

Vietnamese Tokenizer package based on deep learning methods

39
Emerging
12 vTuanpham/Vietnamese_QA_System

Vietnamese long-form question answering system with document retrieval.

39
Emerging
13 undertheseanlp/chatbot

Vietnamese Chatbot

39
Emerging
14 mailong25/bert-vietnamese-question-answering

Vietnamese question answering system with BERT

39
Emerging
15 WhySchools/VMDS-vietnamese-misspell-dataset-from-Social-media

Vietnamese Misspell Dataset - a Vietnamese spelling dataset from social media

37
Emerging
16 VinAIResearch/PhoNER_COVID19

COVID-19 Named Entity Recognition for Vietnamese (NAACL 2021)

36
Emerging
17 undertheseanlp/word_tokenize

Vietnamese Word Tokenize

36
Emerging
18 plandes/amr

AMR annotation and feature generation

36
Emerging
19 bmd1905/vietnamese-correction

A project that improves the quality and accuracy of the Vietnamese language.

36
Emerging
20 phongnt570/UETsegmenter

A toolkit for Vietnamese word segmentation

34
Emerging
21 duongntbk/restore_vietnamese_diacritics

A Transformer based NLP solution to restore diacritics for Vietnamese text...

33
Emerging
22 tienthanhdhcn/Vietnamese-Accent-Prediction

A simple/fast/accurate accent prediction for non-accented Vietnamese text

32
Emerging
23 datnnt1997/bert_vn_ner

PyTorch solution of Vietnamese Named Entity Recognition task with Google...

32
Emerging
24 nschneid/amr-tutorial

Abstract Meaning Representation (AMR) tutorial slides

32
Emerging
25 VietHoang1512/vietnamese-spell-correct-and-text-classify

A spell corrector and text classifier using Deep Neural Network

32
Emerging
26 tkhangg0910/ViConBERT

Official Codebase for ViConBERT: Context-Gloss Aligned Vietnamese Word...

32
Emerging
27 PB3002/ViMedical_Disease

A Vietnamese dataset of over 12 thousand questions about common disease...

31
Emerging
28 ds4v/vietnamese-pos-tagging

Vietnamese part-of-speech tagging using a Hidden Markov Model combined with the Viterbi algorithm

30
Emerging
29 pbcquoc/vietnamese_word_seperate

Separate Vietnamese words using LSTM

30
Emerging
30 wonrax/phobert-base-vietnamese-sentiment

PhoBERT fine-tuned for sentiment analysis

30
Emerging
31 dangvansam/phobert-text-classification

Vietnamese text classification using a pretrained model - PhoBERT

29
Experimental
32 Avi197/Phobert-Named-Entity-Reconigtion

Applies the PhoBERT model by VinAI Research to the Vietnamese NER task on various datasets

29
Experimental
33 209sontung/Vietnamese-stock-article-classification

Sentiment-based classification of stock article titles using PhoBERT

29
Experimental
34 chanind/penman-js

Abstract Meaning Representation (AMR) parser and generator for JavaScript

28
Experimental
35 telexyz/data

A collection of Vietnamese corpora

27
Experimental
36 tien02/ensemble-roberta-fasttext-vietnamese

Ensemble PhoBERT with FastText Embedding to improve performance on...

27
Experimental
37 AnhHoang0529/Small-LexNormViHSD

A Dataset for Vietnamese Lexical Normalization

24
Experimental
38 NamSyntax/Vietnamese-QA

Vietnamese-QA is very simple with XLM-RoBERTa fine-tuned on the Vietnamese...

24
Experimental
39 xndien2004/ViAMR

[VLSP 2025] ViAMR: Fine-tuning LLMs for Abstract Meaning Representation in Vietnamese

24
Experimental
40 phkhanhtrinh23/vietnamese_ner_bert

Vietnamese Named-Entity Recognition.

22
Experimental
41 bug-breeder/vant

AI-powered Vietnamese Input Method for macOS — Rust core +...

22
Experimental
42 v-bible/crawler

A collection of web crawlers to crawl Catholic resources in Vietnamese language

21
Experimental
43 nicolay-r/ViLongT5

LongT5-based model pre-trained on a large amount of unlabeled Vietnamese...

20
Experimental
44 kh4nh12/ViTASA

A novel dataset and method for Vietnamese Target-Aspect-Sentiment joint...

20
Experimental
45 vietbtx/ViTextnormASR

Our source code for the paper "Transformer-based Joint Learning Approach for...

19
Experimental
46 ngxtnhi/ViLexNorm

A Lexical Normalization Corpus for Vietnamese Social Media Text

19
Experimental
47 dinhanhx/vcc

Vietnamese Conceptual Caption

18
Experimental
48 manhtt-079/vipubmed-deberta

ViPubmedDeBERTa: A Pre-trained Model for Vietnamese Biomedical Text (PACLIC 2023)

18
Experimental
49 ndthuan/vi-word-segmenter

HTTP wrapper of the VnCoreNLP library - A Vietnamese natural language...

17
Experimental
50 JulienBez/ASMR

This is ASMR, an empirical, alignment-based algorithm used to identify and...

15
Experimental
51 longday1102/Demo-QA-Extraction-system

⚡ The system extracts answers from a given context

12
Experimental
52 Vinfall/CnGal-to-VNDB

A naïve tool to detect missing CnGal releases on VNDB

11
Experimental
53 vanhai1231/Vietnamese-news-classifier

Vietnamese news topic classification using Logistic Regression and TF-IDF. Dự...

10
Experimental
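The tier labels in the listing above appear consistent with fixed score cutoffs: every Established entry scores 50 or more, every Emerging entry falls between 30 and 47, and every Experimental entry scores 29 or less. The sketch below encodes that reading. The cutoffs of 50 and 30 are inferred from the listed scores only, not taken from any published scoring rule, and the function name `tier_for` is hypothetical.

```python
def tier_for(score):
    """Map a 0-100 score to a tier label using cutoffs inferred from the listing."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 50:  # assumed cutoff: the lowest Established score listed is 50
        return "Established"
    if score >= 30:  # assumed cutoff: the lowest Emerging score listed is 30
        return "Emerging"
    return "Experimental"
```

Under these assumed cutoffs, `tier_for(58)` gives the tier shown for vunb/vntk and `tier_for(29)` gives the tier shown for dangvansam/phobert-text-classification. No listed score falls between 48 and 49, so the exact boundary between Emerging and Established cannot be confirmed from this page.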