Livingston-k/cleanPyData
cleanPyData is a Python package for data cleaning and preprocessing. It handles missing values, normalizes data, extracts features, and detects outliers, making your data ready for analysis or machine learning.
When preparing data for analysis or machine learning, you often encounter messy datasets with gaps, inconsistencies, or unusual entries. This tool takes raw, incomplete tabular data and outputs a clean, standardized dataset ready for modeling. It is aimed at data scientists, analysts, and machine learning engineers.
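The package's own function names are not shown on this page, so the sketch below does not use the cleanPyData API. It illustrates, in plain pandas, three of the steps described above (missing-value handling, outlier detection, normalization); feature extraction is left out. The function name `clean_frame`, the median fill, the z-score rule, and the min-max scaling are all illustrative assumptions, not the library's documented behavior.

```python
import pandas as pd


def clean_frame(df: pd.DataFrame, z_thresh: float = 3.0) -> pd.DataFrame:
    """Hypothetical helper: fill gaps, drop outlier rows, min-max scale.

    Only numeric columns are changed; other columns pass through as-is.
    """
    out = df.copy()
    num_cols = out.select_dtypes(include="number").columns

    # 1. Missing values: fill numeric gaps with each column's median.
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())

    # 2. Outliers: drop rows where any numeric value lies more than
    #    z_thresh population standard deviations from the column mean.
    std = out[num_cols].std(ddof=0).replace(0, 1.0)  # avoid division by zero
    z = (out[num_cols] - out[num_cols].mean()) / std
    out = out[(z.abs() <= z_thresh).all(axis=1)].copy()

    # 3. Normalization: min-max scale each numeric column to [0, 1].
    lo = out[num_cols].min()
    span = (out[num_cols].max() - lo).replace(0, 1.0)  # constant columns stay 0
    out[num_cols] = (out[num_cols] - lo) / span

    return out.reset_index(drop=True)


if __name__ == "__main__":
    raw = pd.DataFrame(
        {
            "age": [20, 22, None, 21, 23, 200],
            "city": ["A", "B", "A", "C", "B", "A"],
        }
    )
    # A low threshold is used because z-scores are bounded in tiny samples.
    print(clean_frame(raw, z_thresh=2.0))
```

Note that with very small samples a z-score cannot grow large (it is bounded by the square root of n - 1 for n rows), which is why the usage example lowers `z_thresh`; on realistic data a value around 3 is a common starting point.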
No commits in the last 6 months. Available on PyPI.
Use this if you need to clean, normalize, and refine tabular data quickly and systematically before predictive modeling or reporting.
Not ideal if your primary need is complex feature engineering for unstructured data like text or images, or if you're looking for advanced statistical modeling capabilities.
Stars: 8
Forks: 3
Language: Python
License: MIT
Category:
Last pushed: May 25, 2024
Commits (30d): 0
Dependencies: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Livingston-k/cleanPyData"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
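The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `build_request` and `fetch_quality` are hypothetical, the response format is not documented on this page (so the body is returned as raw text, assumed to be UTF-8), and no API key handling is shown.

```python
import urllib.request

# URL copied verbatim from the curl example above.
API_URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "ml-frameworks/Livingston-k/cleanPyData"
)


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build a plain GET request; no key is sent (keyless tier)."""
    return urllib.request.Request(url, method="GET")


def fetch_quality(url: str = API_URL, timeout: float = 10.0) -> str:
    """Return the raw response body as text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network call; counts against the daily request limit.
    print(fetch_quality())
```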
Higher-rated alternatives
skrub-data/skrub
Machine learning with dataframes
biolab/orange3
🍊 📊 💡 Orange: Interactive data analysis
root-project/root
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
cleanlab/cleanlab
Cleanlab's open-source library is the standard data-centric AI package for data quality and...
drivendataorg/deon
A command line tool to easily add an ethics checklist to your data science projects.