SJTU-DMTai/awesome-ml-data-quality-papers

Papers about training data quality management for ML models.


Experimental

This resource is a curated list of research papers on improving the quality of training data for machine learning models. It helps data scientists understand and apply strategies for refining their datasets so that the resulting models are more robust and reliable. The papers cover identifying problematic data, assessing its impact on models, and selecting and debugging data to improve model performance, fairness, and robustness.

112 stars. No commits in the last 6 months.

Use this if you are a data scientist regularly building and deploying machine learning models and frequently encounter issues with model performance or unexpected behavior due to the quality of your training data.

Not ideal if you are looking for ready-to-use software tools or libraries for immediate implementation rather than academic research and theoretical foundations.

Machine Learning Engineering, Data Science, ML Model Debugging, Data Quality Management, AI System Development

No License, Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 9 / 25

Maturity 8 / 25

Community 9 / 25


Stars: 112
Forks:
Language: —
License: —

Higher-rated alternatives

voxel51/fiftyone

Refine high-quality datasets and visual AI models

academic/awesome-datascience

:memo: An awesome Data Science repository to learn and apply for real world problems.

sacridini/Awesome-Geospatial

Long list of geospatial tools and resources

r0f1/datascience

Curated list of Python resources for data science.

neomatrix369/awesome-ai-ml-dl

Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes...
