SJTU-DMTai/awesome-ml-data-quality-papers
Papers about training data quality management for ML models.
This resource provides a curated list of research papers focused on improving the quality of training data for machine learning models. It helps data scientists understand and implement strategies to refine their datasets, ultimately leading to more robust and reliable AI systems. You'll find research on identifying problematic data, assessing its impact, and techniques for data selection and debugging to enhance model performance, fairness, and robustness.
112 stars. No commits in the last 6 months.
Use this if you are a data scientist regularly building and deploying machine learning models and frequently encounter issues with model performance or unexpected behavior due to the quality of your training data.
Not ideal if you are looking for ready-to-use software tools or libraries for immediate implementation rather than academic research and theoretical foundations.
Stars: 112
Forks: 7
Language: —
License: —
Category:
Last pushed: Oct 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SJTU-DMTai/awesome-ml-data-quality-papers"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
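The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl call, using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical. Reading the URL path as category / owner / repository is an assumption inferred from the example URL, and the response format is not documented here, so the code tries JSON and falls back to the raw text.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL with the same path layout as the curl example.

    Assumption: the three path segments are category, owner, and repository.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record anonymously (no API key is sent).

    Assumption: the body is JSON. If it does not parse, the raw text is returned.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return body
```

Calling `fetch_quality("ml-frameworks", "SJTU-DMTai", "awesome-ml-data-quality-papers")` issues the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is worth considering for repeated lookups.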
Higher-rated alternatives
voxel51/fiftyone
Refine high-quality datasets and visual AI models
academic/awesome-datascience
An awesome Data Science repository to learn and apply for real world problems.
sacridini/Awesome-Geospatial
Long list of geospatial tools and resources
r0f1/datascience
Curated list of Python resources for data science.
neomatrix369/awesome-ai-ml-dl
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes...