Automunge/AutoMunge

Tabular feature encoding pipelines for machine learning with options for string parsing, missing data infill, and stochastic perturbations.

61 / 100 (Established)

This tool helps data scientists, machine learning engineers, and analysts prepare raw tabular data for machine learning models. It automates tasks such as converting text and categorical values into numerical formats, filling in missing data, and normalizing numeric columns. You supply 'messy' spreadsheet-like data, and it returns cleaned, structured data ready for model training, along with a 'recipe' for processing future data of the same form consistently.
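The 'recipe' idea above can be illustrated with a minimal, library-free sketch. This is not Automunge's API: the function names `fit_recipe` and `apply_recipe`, the choice of mean infill, z-score normalization, and one-hot encoding, and the sample rows are all assumptions made for illustration. The point is only that parameters are learned once from training data and then reused unchanged on later data.

```python
from statistics import mean, pstdev


def fit_recipe(rows, numeric_cols, categorical_cols):
    """Learn encoding parameters from training rows (a list of dicts)."""
    recipe = {"numeric": {}, "categorical": {}}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r.get(col) is not None]
        # Fall back to 1.0 so constant columns do not divide by zero.
        recipe["numeric"][col] = {
            "mean": mean(observed),
            "std": pstdev(observed) or 1.0,
        }
    for col in categorical_cols:
        # Fix the category list (and its order) at fit time.
        recipe["categorical"][col] = sorted(
            {r[col] for r in rows if r.get(col) is not None}
        )
    return recipe


def apply_recipe(recipe, rows):
    """Encode rows using only the parameters stored in the recipe."""
    encoded_rows = []
    for r in rows:
        encoded = {}
        for col, params in recipe["numeric"].items():
            value = r.get(col)
            if value is None:
                value = params["mean"]  # mean infill for missing entries
            encoded[col] = (value - params["mean"]) / params["std"]
        for col, categories in recipe["categorical"].items():
            # One-hot encode; missing or unseen categories become all zeros.
            for cat in categories:
                encoded[f"{col}_{cat}"] = 1 if r.get(col) == cat else 0
        encoded_rows.append(encoded)
    return encoded_rows


# Hypothetical training data with missing entries.
train = [
    {"age": 30, "city": "Oslo"},
    {"age": None, "city": "Lima"},
    {"age": 50, "city": None},
]

recipe = fit_recipe(train, numeric_cols=["age"], categorical_cols=["city"])
train_encoded = apply_recipe(recipe, train)

# Later data is processed with the same recipe, without refitting.
new_encoded = apply_recipe(recipe, [{"age": 40, "city": "Paris"}])
```

Because `apply_recipe` reads only from the stored recipe, a category that was absent at fit time (here "Paris") does not add a column, so the output layout stays consistent between training and later data.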

164 stars. Available on PyPI.

Use this if you need to quickly and consistently clean and transform your tabular datasets for machine learning, especially if you have a mix of numerical, categorical, and text data with missing entries.

Not ideal if your data is not tabular (e.g., images, audio, pure text documents) or if you prefer to manually control every step of your data transformation process.

data-preprocessing feature-engineering machine-learning-preparation data-cleaning tabular-data-transformation
Maintenance 10 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 16 / 25


Stars: 164
Forks: 22
Language: Jupyter Notebook
License: BSD-3-Clause
Last pushed: Mar 08, 2026
Commits (30d): 0
Dependencies: 4

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Automunge/AutoMunge"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
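The same request can be sketched in Python using only the standard library. The URL is the one from the curl command above. The helper names, the reading of the three path segments as category, owner, and repository, and the assumption that the response body is JSON are all guesses, since the response format is not documented here.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the segment names are assumed, not documented."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record, assuming the endpoint returns a JSON body."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Performs a live network request and counts against the daily limit.
    print(fetch_quality("ml-frameworks", "Automunge", "AutoMunge"))
```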