Data-Centric-AI-Community/ydata-profiling

Data quality profiling and exploratory data analysis for Pandas and Spark DataFrames in one line of code.

58 / 100 (Established)

This tool helps data analysts and data scientists quickly understand the quality and characteristics of their datasets. You provide your raw data as a DataFrame, and it generates a report covering summary statistics and potential issues such as missing values or skewness, with additional analysis for time-series and text data. It suits an initial deep dive into a new dataset or monitoring changes in an existing one.
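As a rough illustration of the workflow described above, the sketch below wraps the library's `ProfileReport` class in a small helper. The helper name `write_profile` and the up-front extension check are assumptions made for this example and are not part of ydata-profiling itself; the check reflects the `.html` and `.json` outputs that `ProfileReport.to_file` is documented to produce.

```python
from pathlib import Path


def write_profile(df, title, output_path):
    """Profile a pandas DataFrame and write the report to disk.

    `write_profile` is a hypothetical helper for illustration only.
    """
    path = Path(output_path)
    # Assumption: restrict output to the two report formats (.html, .json).
    if path.suffix not in {".html", ".json"}:
        raise ValueError(f"expected a .html or .json path, got {path.name!r}")

    # Imported lazily so this module still loads where the
    # third-party package ydata-profiling is not installed.
    from ydata_profiling import ProfileReport

    report = ProfileReport(df, title=title)  # the "one line" that builds the profile
    report.to_file(path)                     # render and save the report
    return path
```

With pandas and ydata-profiling installed, a call such as `write_profile(pd.read_csv("data.csv"), "Dataset overview", "report.html")` would produce a shareable HTML report; `data.csv` is a placeholder file name.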

13,427 stars.

Use this if you need to perform quick, extensive exploratory data analysis and generate a shareable report on your dataset's health and composition.

Not ideal if you primarily need to connect to diverse database systems for profiling without first loading data into a DataFrame, or require advanced, guided data governance features.

data-quality exploratory-data-analysis data-auditing time-series-analysis dataset-comparison
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25


Stars: 13,427
Forks: 1,766
Language: Python
License: MIT
Last pushed: Mar 03, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/Data-Centric-AI-Community/ydata-profiling"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
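The same request can be sketched in Python using only the standard library. The function names `build_quality_url` and `fetch_quality` are hypothetical, and the sketch assumes the endpoint returns a JSON body; the response fields are not documented here, so the parsed result is returned as-is.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the quality-score URL for one repository."""
    # Percent-encode each path segment so unusual characters stay within it.
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and parse the quality data (assumes a JSON response)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("data-engineering", "Data-Centric-AI-Community", "ydata-profiling")` issues the same GET request as the curl command and counts against the same daily request limit.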