quincyliang/nlp-data-augmentation
Data Augmentation for NLP.
When working with text data for AI models, you often don't have enough examples to train effectively. This project helps you create more varied text samples from your existing data using techniques like synonym replacement, word shuffling, and translation. It's for data scientists, machine learning engineers, and NLP practitioners who need to expand their datasets to build more robust language models.
294 stars. No commits in the last 6 months.
Use this if you have a limited text dataset and need to generate more training examples to improve your natural language processing models.
Not ideal if you're looking for tools to collect entirely new, original text data rather than augmenting existing samples.
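To make the techniques named in the description concrete, here is a minimal Python sketch of synonym replacement and word shuffling (implemented as random position swaps). It is not this project's API: the `SYNONYMS` table and the function names `synonym_replace`, `random_swap`, and `augment` are hypothetical and chosen only for illustration. Translation-based augmentation is left out because it needs an external translation service.

```python
import random

# Hypothetical toy synonym table (an assumption for this sketch);
# a real pipeline would draw on a thesaurus or word embeddings.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "happy": ["glad", "cheerful"],
}


def synonym_replace(tokens, rng, p=0.5):
    """Replace each token that has a known synonym with probability p."""
    out = []
    for tok in tokens:
        options = SYNONYMS.get(tok.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    return out


def random_swap(tokens, rng, n_swaps=1):
    """Swap two randomly chosen positions n_swaps times (mild word shuffling)."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(sentence, n_variants=3, seed=0):
    """Return n_variants augmented copies of a whitespace-tokenized sentence."""
    rng = random.Random(seed)  # seeded so results are reproducible
    tokens = sentence.split()
    variants = []
    for _ in range(n_variants):
        new_tokens = random_swap(synonym_replace(tokens, rng), rng)
        variants.append(" ".join(new_tokens))
    return variants
```

Calling `augment("the quick dog is happy")` yields three variants with the same number of words as the input. Swapping word order can change meaning, so variants produced this way are usually worth spot-checking before they go into a training set.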
Stars: 294
Forks: 41
Language: —
License: —
Category:
Last pushed: Dec 10, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/quincyliang/nlp-data-augmentation"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
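The same request can be made from Python using only the standard library. This is a rough sketch under two assumptions that the listing does not confirm: that the endpoint returns JSON, and that the URL follows a `quality/<domain>/<owner>/<repo>` pattern generalized from the single curl example above. The helper names `build_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the path layout below is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(domain, repo):
    """Build the request URL, e.g. domain='nlp', repo='owner/name'."""
    return f"{BASE_URL}/{domain}/{repo}"


def fetch_quality(domain, repo, timeout=10):
    """Fetch the listing data; assumes the response body is JSON."""
    with urllib.request.urlopen(build_url(domain, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "quincyliang/nlp-data-augmentation")
```

Since the response schema is not documented here, it is safest to inspect the returned object before relying on any particular field.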
Compare
Higher-rated alternatives
dsfsi/textaugment
TextAugment: Text Augmentation Library
425776024/nlpcda
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
google-research/uda
Unsupervised Data Augmentation (UDA)
searchableai/KitanaQA
KitanaQA: Adversarial training and data augmentation for neural question-answering models
SanghunYun/UDA_pytorch
UDA (Unsupervised Data Augmentation) implemented in PyTorch