lancopku/text-autoaugment

[EMNLP 2021] Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification

/ 100

Emerging

This project helps data scientists and machine learning engineers enhance text classification models, especially when working with limited data. It takes your existing text dataset (like customer reviews or survey responses) and automatically generates diverse, high-quality augmented text samples. The output is an expanded training dataset that can be used to improve the performance and generalization of deep learning models like BERT for text classification tasks.
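To make "an expanded training dataset" concrete, here is a minimal sketch of label-preserving text augmentation. It is not this project's API or its learned-policy method: the function names (`random_swap`, `random_delete`, `augment_dataset`) and the two simple token-level operations are assumptions chosen for illustration only.

```python
import random

def random_swap(tokens, rng):
    """Swap two randomly chosen token positions (no-op for fewer than 2 tokens)."""
    if len(tokens) < 2:
        return list(tokens)
    i, j = rng.sample(range(len(tokens)), 2)
    out = list(tokens)
    out[i], out[j] = out[j], out[i]
    return out

def random_delete(tokens, rng, p=0.1):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def augment_dataset(samples, n_aug=2, seed=0):
    """Return the original (text, label) pairs plus n_aug augmented copies of each.

    Hypothetical helper: each copy applies one randomly chosen operation
    and keeps the original label.
    """
    rng = random.Random(seed)
    ops = [random_swap, random_delete]
    augmented = list(samples)
    for text, label in samples:
        tokens = text.split()
        for _ in range(n_aug):
            op = rng.choice(ops)
            augmented.append((" ".join(op(tokens, rng)), label))
    return augmented
```

Calling `augment_dataset([("the service was great", "pos")], n_aug=2)` would return the original pair followed by two perturbed variants carrying the same label. A learned approach such as the one this project describes searches for which operations to compose rather than picking them uniformly at random.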

130 stars. No commits in the last 6 months.

Use this if you are building text classification models and struggle with low data availability or class imbalance, and want to improve model accuracy by automatically generating more training examples.

Not ideal if your task is something other than text classification, or if you already have a large, well-balanced training dataset.

text-classification natural-language-processing machine-learning-engineering data-augmentation deep-learning-optimization

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 14 / 25


Stars

130

Forks

Language

Python

License

MIT

Higher-rated alternatives

dsfsi/textaugment

TextAugment: Text Augmentation Library

425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

google-research/uda

Unsupervised Data Augmentation (UDA)

searchableai/KitanaQA

KitanaQA: Adversarial training and data augmentation for neural question-answering models

SanghunYun/UDA_pytorch

UDA (Unsupervised Data Augmentation) implemented in PyTorch
