Data-Centric-AI-Community/awesome-data-centric-ai
Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖
This is a curated collection of resources for anyone developing or managing AI systems. It helps practitioners improve AI outcomes by focusing on the quality and characteristics of the data rather than only on the model. It offers tools and tutorials for tasks such as understanding data, generating realistic synthetic data, and labeling data accurately, and is aimed at data scientists, machine learning engineers, and data quality specialists.
Use this if you are building or deploying AI models and want to improve their performance and reliability by systematically enhancing your datasets.
Not ideal if you are looking for specific AI model architectures or algorithms, as this focuses on data-related aspects rather than model development.
Stars: 345
Forks: 47
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Feb 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Data-Centric-AI-Community/awesome-data-centric-ai"
Open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
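The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the URL path follows the pattern category/owner/repository, inferred from the single curl example above, and that the response body is JSON, which this page does not confirm. The names `build_url` and `fetch_quality` are hypothetical helpers, not part of any published client.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is assumed
# to be <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "Data-Centric-AI-Community", "awesome-data-centric-ai")` issues the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is a reasonable choice for repeated use.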
Related frameworks
voxel51/fiftyone
Refine high-quality datasets and visual AI models
academic/awesome-datascience
An awesome Data Science repository to learn from and apply to real-world problems.
sacridini/Awesome-Geospatial
Long list of geospatial tools and resources
r0f1/datascience
Curated list of Python resources for data science.
nhivp/Awesome-Embedded
A curated list of awesome embedded programming.