jasonwei20/eda_nlp

Data augmentation for NLP, presented at EMNLP 2019

42 / 100 (Emerging)

This tool helps improve the accuracy of text classification models, especially when you have a small dataset. It takes your existing labeled text data and generates new, subtly varied sentences, effectively expanding your training set. This is ideal for machine learning engineers, data scientists, or researchers who are building models to categorize text.

1,651 stars. No commits in the last 6 months.

Use this if you are working on a text classification project and your model's performance is limited by the amount of available training data.

Not ideal if you already have a very large text dataset or if you require highly specialized, domain-specific augmentation beyond simple word edits.
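To make "simple word edits" concrete, here is a minimal sketch of two such edits, random swap and random deletion, applied to labeled sentences. It is an illustration only and not the repository's code: the function names (`random_swap`, `random_deletion`, `augment`) and the default parameters are hypothetical.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, label, n_aug=4, p_delete=0.1, n_swaps=1, seed=0):
    """Return the original (sentence, label) pair plus n_aug edited copies.

    Hypothetical helper: alternates between swap and deletion edits and
    keeps the label unchanged, since the edits are meant to be subtle.
    """
    rng = random.Random(seed)
    words = sentence.split()
    rows = [(sentence, label)]
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, p_delete, rng)
        rows.append((" ".join(new_words), label))
    return rows


if __name__ == "__main__":
    for text, lbl in augment("the movie was surprisingly good", "positive"):
        print(lbl, "|", text)
```

Edits like these only reorder or drop existing words, which is why they add little when the dataset is already large or when the domain needs augmentation that understands its vocabulary.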

text-classification natural-language-processing machine-learning-training data-scarcity model-performance
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 24 / 25

Stars: 1,651
Forks: 313
Language: Python
License: None
Last pushed: Mar 19, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/jasonwei20/eda_nlp"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
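The same request can be made from Python with only the standard library. This is a sketch under assumptions: the page shows the URL for one repository only, so the `owner/repo` substitution in the hypothetical `quality_url` helper is a guess at the pattern, and the response is returned as raw UTF-8 text because its format is not documented here.

```python
import urllib.request

# Base path copied from the curl example; treating the trailing
# "owner/repo" part as substitutable is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner, repo):
    """Build the request URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text.

    No key is sent, so this falls under the keyless daily limit.
    The body is assumed to be UTF-8 and is not parsed.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_quality("jasonwei20", "eda_nlp"))
```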