Data Augmentation NLP Tools
Tools and frameworks for generating synthetic training data, augmenting existing datasets, and applying transformation techniques to improve NLP model performance. Does NOT include general data preprocessing, cleaning, or annotation tools.
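To make the scope concrete, here is a minimal sketch of two transformation techniques from the EDA (Easy Data Augmentation) family that several of the listed projects target: random swap and random deletion. The function names, parameters, and defaults are hypothetical and chosen for illustration; they are not the API of any tool in this list.

```python
import random


def random_swap(words, n_swaps, rng):
    """Return a copy of `words` with two random positions swapped, n_swaps times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    """Drop each word independently with probability p; never return an empty list."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    # If every word was dropped, fall back to one randomly chosen word.
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_variants=4, p_delete=0.1, n_swaps=1, seed=0):
    """Produce n_variants noisy copies of a sentence, alternating the two operations."""
    rng = random.Random(seed)  # seeded so results are reproducible
    words = sentence.split()
    variants = []
    for k in range(n_variants):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        variants.append(" ".join(new_words))
    return variants
```

Calling `augment("the quick brown fox jumps")` returns four perturbed copies of the sentence that can be added to a training set under the original label. The libraries in the table typically go further, for example with synonym replacement or back translation.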
33 data augmentation NLP tools are tracked. Two score above 50 (Established tier). The highest-rated is dsfsi/textaugment at 64/100 with 433 stars.
Get all 33 projects as JSON
```sh
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=data-augmentation-nlp&limit=20"
```
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
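The same request can be made from Python. This is a minimal sketch using only the standard library. It reuses the query parameters from the curl example (`domain`, `subcategory`, `limit`). The function names are hypothetical, and the response schema is not documented here, so the body is simply decoded as JSON. Whether `limit` accepts values larger than 20 is also not stated, so the default mirrors the example.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="data-augmentation-nlp", limit=20):
    """Build the query URL from the three parameters shown in the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the body as JSON (schema assumed unknown)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network request, counted against the daily quota):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Because unauthenticated access is capped at 100 requests/day, it is worth caching the decoded response locally and not re-fetching it on every run.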
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | dsfsi/textaugment<br>TextAugment: Text Augmentation Library | 64/100 | Established |
| 2 | 425776024/nlpcda<br>One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda | | Established |
| 3 | google-research/uda<br>Unsupervised Data Augmentation (UDA) | | Emerging |
| 4 | searchableai/KitanaQA<br>KitanaQA: Adversarial training and data augmentation for neural... | | Emerging |
| 5 | SanghunYun/UDA_pytorch<br>UDA (Unsupervised Data Augmentation) implemented in PyTorch | | Emerging |
| 6 | KennethEnevoldsen/augmenty<br>Augmenty is an augmentation library based on spaCy for augmenting texts. | | Emerging |
| 7 | toriving/KoEDA<br>Korean Easy Data Augmentation | | Emerging |
| 8 | AlexKay28/zarnitsa<br>Zarnitsa package for data augmentation ops | | Emerging |
| 9 | zhanlaoban/EDA_NLP_for_Chinese<br>An implementation of the EDA paper for Chinese corpora. EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes. | | Emerging |
| 10 | lancopku/text-autoaugment<br>[EMNLP 2021] Text AutoAugment: Learning Compositional Augmentation Policy... | | Emerging |
| 11 | quincyliang/nlp-data-augmentation<br>Data Augmentation for NLP. | | Emerging |
| 12 | patrick-batman/Unsupervised-Hypothesis-Creation<br>Unsupervised creation of contradictory, entailing sentences from a given... | | Emerging |
| 13 | chck/AugLy-jp<br>Data Augmentation for Japanese Text on AugLy | | Experimental |
| 14 | k4black/fast-aug<br>Fast Augmentation library for NLP | | Experimental |
| 15 | zhaominyiz/EPiDA<br>Official Code for 'EPiDA: An Easy Plug-in Data Augmentation Framework for... | | Experimental |
| 16 | remydecoupes/GeoNLPlify<br>An NLP library for data augmentation focusing on... | | Experimental |
| 17 | kajyuuen/daaja<br>This repository has implementations of data augmentation for NLP for Japanese. | | Experimental |
| 18 | ChetanMJ/NL2SQL-Data-Augmentation<br>Data augmentation techniques help improve performance by generating data of... | | Experimental |
| 19 | pemagrg1/nlp-data-augmentation<br>Augmenting Textual Data Using NLP Libraries. | | Experimental |
| 20 | Ritvik19/Text-Data-Augmentation<br>State of the Art Text Data Augmentation for Natural Language Processing Applications | | Experimental |
| 21 | aryashah2k/NLP-Data-Augmentation<br>Implementing 5 Different Approaches To Augmenting Data For Natural Language... | | Experimental |
| 22 | masoudMZB/Text-Wizard-Fatsapi-NLP-project<br>NLP visualization and augmentation techniques implemented with FastAPI. | | Experimental |
| 23 | dheeraj7596/CONDA<br>Generate synthetic training data using small LMs. | | Experimental |
| 24 | sminerport/TextAugmentor<br>This repo offers a Python script using NLPAug library & RTT to augment text... | | Experimental |
| 25 | ClaudiaShu/UNA<br>This is the official code of our Paper "Unsupervised hard Negative... | | Experimental |
| 26 | aflah02/NLP-Albumentations-Data-Augmentation<br>This repository contains helper functions which can help you generate... | | Experimental |
| 27 | EsratMaria/Data_Augmentation_with_NLP<br>A few NLP augmentation techniques: synonym/antonym and back translation. | | Experimental |
| 28 | dextergui/NLarge<br>NLarge - Dataset Augmentation Tool | | Experimental |
| 29 | sounritesh/CIAug-NAACL<br>Official PyTorch implementation of CIAug: Equipping Interpolative... | | Experimental |
| 30 | Shreyasi2002/Legal-Augmentation<br>Analyzing various data augmentation techniques for enrichment of legal text... | | Experimental |
| 31 | jaaack-wang/linguistic-knowledge-in-DA-for-NLP<br>Source Code, data, and results for my paper titled Linguistic Knowledge in... | | Experimental |
| 32 | xmy0916/nlp_aug<br>Chinese NLP Data Augmentation, for simplicity!!! | | Experimental |
| 33 | FairNLP/perturbers<br>Low-code neural data augmentation for fairness | | Experimental |