imbalanced-learn and machine-learning-imbalanced-data

The first is a mature, production-ready library of resampling and algorithmic methods for handling class imbalance. The second is an educational repository that teaches imbalance techniques and uses that library as a dependency, so the two are complementary rather than competitive.

imbalanced-learn
Maintenance 13/25
Adoption 15/25
Maturity 25/25
Community 24/25
Stars: 7,090
Forks: 1,328
Downloads:
Commits (30d): 1
Language: Python
License: MIT
No risk flags

machine-learning-imbalanced-data
Maintenance 0/25
Adoption 10/25
Maturity 16/25
Community 25/25
Stars: 188
Forks: 223
Downloads:
Commits (30d): 0
Language: Jupyter Notebook
License:
Risk flags: Stale 6m, No Package, No Dependents

About imbalanced-learn

scikit-learn-contrib/imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

This tool helps data scientists and machine learning engineers build more accurate predictive models when their datasets have unequal numbers of examples across categories. It takes a raw, imbalanced dataset and applies re-sampling techniques to produce a more balanced one, which can improve model performance, especially on the under-represented categories. This is particularly useful for tasks where correctly identifying rare events is critical.
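To make the idea of re-sampling concrete, here is a minimal sketch of random over-sampling written with only the Python standard library. It is an illustration of the technique, not the library's code: the function name `random_oversample`, its `seed` parameter, and the toy data are all assumptions made for this example. imbalanced-learn itself exposes this kind of operation through sampler objects such as `RandomOverSampler` and their `fit_resample(X, y)` method.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Hypothetical helper: duplicate rows of smaller classes at random
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())

    # Start from a copy of the original data, then append duplicates.
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Indices of the original rows that belong to this class.
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - count):
            j = rng.choice(idx)
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out


# Toy data (made up): 8 majority rows labelled 0, 2 minority rows labelled 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))  # both classes now have 8 rows
```

In practice, re-sampling is usually applied to the training split only, so that evaluation data keeps the original class distribution.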

predictive-modeling data-preprocessing imbalanced-data-classification fraud-detection medical-diagnosis

About machine-learning-imbalanced-data

solegalli/machine-learning-imbalanced-data

Code repository for the online course Machine Learning with Imbalanced Data

When building machine learning models, especially for rare events like fraud detection or disease diagnosis, you often encounter imbalanced datasets where one outcome is far less common. This project addresses the problem by providing techniques to balance your data, which can lead to more accurate and reliable predictions. Data scientists and machine learning engineers will find it useful for improving model performance.

data-science machine-learning-engineering predictive-modeling fraud-detection medical-diagnosis

Scores updated daily from GitHub, PyPI, and npm data.