Livingston-k/cleanPyData

cleanPyData is a Python package for data cleaning and preprocessing. It handles missing values, normalizes data, extracts features, and detects outliers, making your data ready for analysis or machine learning.

/ 100

Emerging

Datasets prepared for analysis or machine learning often contain gaps, inconsistencies, or unusual entries. This tool takes raw, incomplete tabular data and outputs a clean, standardized dataset ready for modeling, which makes it a fit for data scientists, analysts, and machine learning engineers.
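As a rough illustration of the kind of cleaning described here, the following sketch fills missing values, drops outliers, and normalizes a single numeric column. It uses only the Python standard library and does not use cleanPyData's own API, which this page does not document. The function names (`fill_missing`, `iqr_outlier_indices`, `min_max_normalize`, `clean_column`), the median fill, the 1.5 × IQR rule, and the sample data are all assumptions chosen for the example.

```python
from statistics import median, quantiles


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_indices(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def clean_column(values):
    """Fill gaps, drop IQR outliers, then normalize what remains."""
    filled = fill_missing(values)
    outliers = set(iqr_outlier_indices(filled))
    kept = [v for i, v in enumerate(filled) if i not in outliers]
    return min_max_normalize(kept)


# Hypothetical column with one gap (None) and one implausible entry (250).
raw_ages = [23, None, 31, 27, 250, 29]
cleaned = clean_column(raw_ages)
print(cleaned)  # [0.0, 0.75, 1.0, 0.5, 0.75]
```

The order of steps is a design choice: filling gaps first lets the outlier check see every row, and removing outliers before scaling keeps a single extreme value from compressing the rest of the column toward zero. Feature extraction, which the package description also mentions, is left out of this sketch.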

No commits in the last 6 months. Available on PyPI.

Use this if you need to clean, normalize, and refine tabular data quickly and systematically before predictive modeling or reporting.

Not ideal if your primary need is complex feature engineering for unstructured data like text or images, or if you're looking for advanced statistical modeling capabilities.

data-preparation data-analysis machine-learning-prep data-quality dataset-refinement

Stale 6m

Maintenance 0 / 25

Adoption 4 / 25

Maturity 25 / 25

Community 14 / 25

Language: Python

License: MIT

Higher-rated alternatives

skrub-data/skrub

Machine learning with dataframes

biolab/orange3

🍊 📊 💡 Orange: Interactive data analysis

root-project/root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically

cleanlab/cleanlab

Cleanlab's open-source library is the standard data-centric AI package for data quality and...

drivendataorg/deon

A command line tool to easily add an ethics checklist to your data science projects.
