quincyliang/nlp-data-augmentation

Data Augmentation for NLP.

/ 100

Emerging

When working with text data for AI models, you often don't have enough examples to train effectively. This project helps you create more varied text samples from your existing data using techniques like synonym replacement, word shuffling, and translation. It's for data scientists, machine learning engineers, and NLP practitioners who need to expand their datasets to build more robust language models.
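As a rough illustration of two of the techniques named above (synonym replacement and word shuffling), here is a minimal sketch using only the Python standard library. It is not this repository's API: the `SYNONYMS` table and the functions `synonym_replace`, `shuffle_words`, and `augment` are hypothetical names chosen for the example, and the tiny English synonym table stands in for whatever thesaurus or embedding lookup a real pipeline would use.

```python
import random

# Hypothetical toy synonym table (an assumption for this sketch);
# a real pipeline would draw candidates from a thesaurus or embeddings.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}


def synonym_replace(tokens, rng, p=0.5):
    """Replace each token that has a known synonym with probability p."""
    out = []
    for tok in tokens:
        candidates = SYNONYMS.get(tok.lower())
        if candidates and rng.random() < p:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def shuffle_words(tokens, rng):
    """Return a new list holding the same tokens in random order."""
    shuffled = list(tokens)
    rng.shuffle(shuffled)
    return shuffled


def augment(sentence, n=3, seed=0):
    """Produce n variants of a sentence; every second variant is also shuffled."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    tokens = sentence.split()
    variants = []
    for i in range(n):
        new_tokens = synonym_replace(tokens, rng)
        if i % 2 == 1:
            new_tokens = shuffle_words(new_tokens, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("the quick dog is happy", n=4):
        print(variant)
```

Shuffling discards word order, so it tends to suit order-insensitive tasks such as topic classification better than tasks where syntax carries the label. Translation-based augmentation is left out here because it needs an external translation model or service.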

294 stars. No commits in the last 6 months.

Use this if you have a limited text dataset and need to generate more training examples to improve your natural language processing models.

Not ideal if you're looking for tools to collect entirely new, original text data rather than augmenting existing samples.

text analytics, machine learning datasets, natural language processing, AI model training, data enrichment

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 18 / 25


Stars: 294

Forks:

Language: n/a

License: n/a


Higher-rated alternatives

dsfsi/textaugment

TextAugment: Text Augmentation Library

425776024/nlpcda

One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda

google-research/uda

Unsupervised Data Augmentation (UDA)

searchableai/KitanaQA

KitanaQA: Adversarial training and data augmentation for neural question-answering models

SanghunYun/UDA_pytorch

UDA (Unsupervised Data Augmentation) implemented in PyTorch
