Data-Centric-AI-Community/ydata-quality

Data Quality assessment with one line of code

Score: 54 / 100 (Established)

This tool helps data professionals quickly check the quality of their datasets before using them for analysis or machine learning. You provide your raw or transformed dataset, and it automatically flags issues like duplicate entries, highly correlated features, missing values, or erroneous data. It's designed for data scientists, machine learning engineers, and data analysts who need to ensure data reliability.
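To make the kinds of checks described above concrete, here is a minimal sketch in plain Python. It is not ydata-quality's API or implementation: the function names, the row-as-dict layout, and the 0.9 correlation threshold are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def find_duplicate_rows(rows):
    """Return each row that occurs more than once (reported once per distinct row)."""
    # Rows are dicts; sort the items so key order does not affect equality.
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, n in counts.items() if n > 1]


def missing_value_ratio(rows, columns):
    """Return, per column, the fraction of rows whose value is None."""
    total = len(rows)
    if total == 0:
        return {c: 0.0 for c in columns}
    return {c: sum(row.get(c) is None for row in rows) / total for c in columns}


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # A constant column has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def highly_correlated(rows, columns, threshold=0.9):
    """Return (col_a, col_b, r) for column pairs with |r| >= threshold (hypothetical cutoff)."""
    flagged = []
    for i, a in enumerate(columns):
        for b in columns[i + 1:]:
            # Use only rows where both values are present.
            pairs = [(row.get(a), row.get(b)) for row in rows
                     if row.get(a) is not None and row.get(b) is not None]
            if len(pairs) < 2:
                continue
            xs, ys = zip(*pairs)
            r = pearson(xs, ys)
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged
```

Each function takes a list of dicts (one per row), so a small dataset can be checked without any third-party dependency. A real tool would typically operate on a DataFrame and cover more cases, such as near-duplicates and type errors.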

454 stars.

Use this if you need a quick, comprehensive overview of potential quality problems in your tabular datasets that could impact downstream analysis or model performance.

Not ideal if you need to build custom, complex data validation rules that go beyond standard quality checks, or if you're not comfortable working with Python.

Tags: data-quality-assurance, data-preparation, machine-learning-engineering, data-analysis, data-governance
No package, no dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 18 / 25

Stars: 454
Forks: 56
Language: Jupyter Notebook
License: MIT
Last pushed: Mar 02, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Data-Centric-AI-Community/ydata-quality"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
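The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (path layout is an assumption)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch and decode the quality record, assuming the body is JSON."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; counts against the daily limit.
    record = fetch_quality("ml-frameworks", "Data-Centric-AI-Community", "ydata-quality")
    print(json.dumps(record, indent=2))
```

Keeping URL construction separate from the network call makes the path logic easy to check offline. With the keyless limit of 100 requests/day, caching responses locally is worth considering for repeated lookups.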