Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
This curated list helps AI practitioners find open-source tools for improving the unstructured datasets behind their machine learning models. It covers common data challenges (such as noise or bias in images, audio, video, time series, or text) and points you to tools for systematically refining your training data, which leads to better-performing AI systems. It is for anyone building or improving AI models using real-world data.
734 stars. No commits in the last 6 months.
Use this if you are building an AI system and need to find open-source tools to systematically improve the quality of your unstructured training data like images, audio, or text.
Not ideal if you are working with tabular data, primarily seeking dedicated data labeling tools, or looking for MLOps infrastructure.
Stars: 734
Forks: 38
Language: none listed
License: CC-BY-4.0
Category:
Last pushed: Nov 15, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Renumics/awesome-open-data-centric-ai"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
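The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, treating the `ml-frameworks` path segment as a "category" is an assumption based only on the URL shown above, and the response is assumed to be JSON because the page does not document its format.

```python
import json
import urllib.request

# Base URL copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Join the base URL, a category slug, and an owner/name repo path.

    Hypothetical helper: the segment names are guesses from the URL shape.
    """
    return f"{BASE_URL}/{category.strip('/')}/{repo.strip('/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the endpoint returns JSON. The schema is not documented
    on this page, so the parsed value is returned unchanged.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "Renumics/awesome-open-data-centric-ai")` requests the same URL as the curl command. Since keyless access is limited to 100 requests/day, it may be worth caching results locally rather than refetching.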
Higher-rated alternatives
voxel51/fiftyone
Refine high-quality datasets and visual AI models
academic/awesome-datascience
An awesome Data Science repository to learn from and apply to real-world problems.
sacridini/Awesome-Geospatial
Long list of geospatial tools and resources
r0f1/datascience
Curated list of Python resources for data science.
nhivp/Awesome-Embedded
A curated list of awesome embedded programming.