pradeepdev-1995/databalancer

Databalancer is a Python library used in machine learning applications to balance imbalanced text classification datasets before model training.

Quality score: 29 / 100 (Experimental)

This library helps machine learning practitioners prepare text data for classification models. It takes an imbalanced text dataset (e.g., a CSV file with text and categories) and generates new, synthetic text examples for under-represented categories. The output is a new, balanced dataset ready for model training, helping to improve model performance on all categories.
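To make the idea of balancing concrete, here is a minimal sketch in plain Python. It does not use Databalancer's API and it is not the library's method: instead of generating synthetic text, it uses simple random oversampling, which duplicates existing minority-class rows. The function name `oversample` and the sample rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample(rows, seed=0):
    """Duplicate minority-class rows until every category matches the largest one.

    `rows` is a list of (text, label) pairs. This is random oversampling,
    a stand-in technique, not Databalancer's synthetic text generation.
    """
    if not rows:
        return []
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in rows:
        by_label[label].append((text, label))
    # Every category is grown to the size of the most frequent one.
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies with replacement to reach the target count.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy dataset: three "positive" rows, one "negative" row.
rows = [
    ("great phone", "positive"),
    ("love it", "positive"),
    ("works well", "positive"),
    ("battery died", "negative"),
]
balanced = oversample(rows)
counts = Counter(label for _, label in balanced)
```

Duplication adds no new wording, so a model can overfit to the repeated rows. That limitation is the usual motivation for generating new synthetic examples instead.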

No commits in the last 6 months. Available on PyPI.

Use this if you are a machine learning engineer or data scientist working with text classification and your dataset has significantly fewer examples for some categories than others, leading to poor model performance on those rare categories.
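One quick way to judge whether a dataset falls into this situation is to compare the sizes of the largest and smallest categories. The sketch below is a generic check written for illustration, not part of Databalancer; the helper name `imbalance_ratio` and the sample labels are hypothetical, and what counts as "significantly" imbalanced is left to the reader.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (size of largest category) / (size of smallest category)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return max(counts.values()) / min(counts.values())


# Hypothetical label column: 90 "billing" tickets and 10 "fraud" tickets.
labels = ["billing"] * 90 + ["fraud"] * 10
ratio = imbalance_ratio(labels)
```

A ratio close to 1 means the categories are roughly even; larger values mean the rarest category is increasingly outnumbered.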

Not ideal if your dataset is already well-balanced, or if you are not working with text classification problems.

text-classification dataset-balancing natural-language-processing machine-learning-engineering
Stale 6m No Dependents
Maintenance 0 / 25
Adoption 4 / 25
Maturity 25 / 25
Community 0 / 25

Stars: 7
Forks:
Language: Python
License: Apache-2.0
Last pushed: Jul 08, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/pradeepdev-1995/databalancer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
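The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` URL pattern is inferred from the single curl example above, and the response is assumed to be JSON, which the page does not confirm. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category, repo, timeout=10):
    """Fetch the record for one repository.

    Assumes the endpoint returns JSON; this is not confirmed by the page.
    """
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.load(resp)


url = quality_url("nlp", "pradeepdev-1995/databalancer")
```

Calling `fetch_quality("nlp", "pradeepdev-1995/databalancer")` performs the network request, which counts against the daily limit.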