trannguyenhan/preprocessing-data
Vietnamese text data preprocessing in 4 steps
This tool helps Vietnamese content creators, marketers, and researchers prepare raw Vietnamese text for analysis. It takes messy, unstandardized Vietnamese text as input and outputs clean, consistently formatted text ready for downstream tasks such as text mining or classification. It suits anyone working with large volumes of user-generated content or articles in Vietnamese.
No commits in the last 6 months.
Use this if you need to standardize and clean Vietnamese text data that might contain inconsistent formatting, Unicode errors, or incorrect capitalization.
Not ideal if your data is not in Vietnamese or if you require advanced natural language processing tasks beyond basic text cleaning.
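The kind of cleanup described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the repository's actual code: the function name `clean_vietnamese_text` and the choice of steps (Unicode NFC normalization, removal of invisible control characters, whitespace collapsing, lowercasing) are assumptions, and the project's own four steps may differ.

```python
import re
import unicodedata


def clean_vietnamese_text(text: str, lowercase: bool = True) -> str:
    """Hypothetical helper (not part of the repository) for basic text cleanup."""
    # Compose "base letter + combining marks" sequences into single precomposed
    # characters, so the same Vietnamese word always has the same code points.
    text = unicodedata.normalize("NFC", text)
    # Drop control/format characters (Unicode categories starting with "C"),
    # such as zero-width spaces, while keeping ordinary whitespace.
    text = "".join(
        ch for ch in text
        if ch.isspace() or not unicodedata.category(ch).startswith("C")
    )
    # Collapse runs of whitespace into one space and trim both ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Optional case folding for inconsistent capitalization.
    if lowercase:
        text = text.lower()
    return text
```

For example, `clean_vietnamese_text("  Tie\u0302\u0301ng   VIE\u0323\u0302T\u200b \n")` returns `"tiếng việt"`. Lowercasing is kept optional here because tasks such as named-entity recognition often depend on the original capitalization.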
Stars: 14
Forks: 6
Language: Jupyter Notebook
License: not specified
Category:
Last pushed: Aug 24, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/trannguyenhan/preprocessing-data"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
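The same request can be made from Python with the standard library. The sketch below rests on two assumptions that the listing does not confirm: that the endpoint returns a JSON body, and that other repositories follow the same `/quality/<category>/<owner>/<repo>` path pattern seen in the curl command. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the request URL; the path pattern is assumed from the one example."""
    # Percent-encode each part, keeping the "/" between owner and repo name.
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "trannguyenhan/preprocessing-data")` would issue the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.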
Higher-rated alternatives
castorini/hedwig
PyTorch deep learning models for document classification
kk7nc/Text_Classification
Text Classification Algorithms: A Survey
AnubhavGupta3377/Text-Classification-Models-Pytorch
Implementation of state-of-the-art text classification models in PyTorch
inspirehep/magpie
Deep neural network framework for multi-label text classification
InseeFrLab/torchTextClassifiers
A unified framework for text classification in PyTorch.