Automunge/AutoMunge

Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbations.

/ 100

Established

This tool helps data scientists, machine learning engineers, and analysts prepare raw tabular data for machine learning models. It automates tasks such as converting text into numerical formats, filling in missing data, and normalizing numbers. Given raw, spreadsheet-like data, it returns cleaned, structured data ready for model training, along with a 'recipe' for processing future data of the same form consistently.
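The fit-then-reapply idea behind that 'recipe' can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not Automunge's API: the function names `fit_recipe` and `apply_recipe`, the sample columns, and the choice of mean infill, z-score normalization, and one-hot encoding are all hypothetical stand-ins for whatever transforms the library actually applies.

```python
from statistics import mean, pstdev

# Hypothetical helper names; not part of the Automunge library.

def fit_recipe(rows, numeric_cols, categorical_cols):
    """Learn infill and encoding parameters from training rows (a list of dicts)."""
    recipe = {"numeric": {}, "categorical": {}}
    for col in numeric_cols:
        # Assumes each numeric column has at least one observed value.
        observed = [r[col] for r in rows if r.get(col) is not None]
        mu = mean(observed)
        sigma = pstdev(observed) or 1.0  # fall back to 1.0 for constant columns
        recipe["numeric"][col] = {"mean": mu, "std": sigma}
    for col in categorical_cols:
        # Record the categories seen in training, in a stable order.
        recipe["categorical"][col] = sorted(
            {r[col] for r in rows if r.get(col) is not None}
        )
    return recipe

def apply_recipe(rows, recipe):
    """Apply a previously learned recipe to any rows with the same columns."""
    encoded_rows = []
    for r in rows:
        encoded = {}
        for col, params in recipe["numeric"].items():
            value = r.get(col)
            if value is None:
                value = params["mean"]  # mean infill for missing entries
            encoded[col] = (value - params["mean"]) / params["std"]
        for col, categories in recipe["categorical"].items():
            for cat in categories:
                # One-hot encoding; missing or unseen categories map to all zeros.
                encoded[f"{col}_{cat}"] = 1 if r.get(col) == cat else 0
        encoded_rows.append(encoded)
    return encoded_rows

train = [
    {"age": 30, "city": "Oslo"},
    {"age": None, "city": "Rome"},
    {"age": 50, "city": None},
]
recipe = fit_recipe(train, ["age"], ["city"])
train_encoded = apply_recipe(train, recipe)
new_encoded = apply_recipe([{"age": 40, "city": "Lima"}], recipe)
```

Because the parameters are learned once from the training rows and then reused, later data is scaled and encoded with the same basis, which is the consistency the description refers to.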

164 stars. Available on PyPI.

Use this if you need to quickly and consistently clean and transform your tabular datasets for machine learning, especially if you have a mix of numerical, categorical, and text data with missing entries.

Not ideal if your data is not tabular (e.g., images, audio, pure text documents) or if you prefer to manually control every step of your data transformation process.

data-preprocessing feature-engineering machine-learning-preparation data-cleaning tabular-data-transformation

Maintenance 10 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 16 / 25

Stars

164

Forks

Language

Jupyter Notebook

License

BSD-3-Clause

Related frameworks

process-intelligence-solutions/pm4py

Official public repository for PM4Py (Process Mining for Python) — an open-source library for...

autogluon/autogluon

Fast and Accurate ML in 3 Lines of Code

microsoft/FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.

shankarpandala/lazypredict

Lazy Predict helps build a lot of basic models without much code and helps understand which...

aimclub/FEDOT

Automated modeling and machine learning framework FEDOT
