diffgram/awesome-training-data

Curated list of Awesome Training Data! (Data Labeling, Annotation, Discovery, Workflow etc)

Score: 24 / 100 (Experimental)

This resource helps machine learning engineers, data scientists, and project managers discover tools and platforms for preparing high-quality data to train AI models. It curates solutions for tasks such as labeling images, annotating video, and transcribing audio, all of which turn raw data into structured datasets ready for model training. It is aimed at professionals building or managing AI/ML projects.

No commits in the last 6 months.

Use this if you are an AI/ML practitioner looking for a comprehensive list of open-source and commercial tools to efficiently label, annotate, and manage training data for your machine learning models across different data types.

Not ideal if you are looking for a step-by-step tutorial or an actual software tool to perform data labeling directly, as this is a curated list of external resources.

machine-learning-engineering data-annotation data-labeling computer-vision natural-language-processing
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 11 / 25
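
The four category scores above add up to the overall score of 24 / 100. The short Python sketch below reproduces that arithmetic. It assumes the overall score is a plain sum of the four categories; the page does not state the formula, so this is an inference from the numbers shown, not a description of the actual scoring method.

```python
# Category scores as shown on the page, each reported out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 5,
    "Maturity": 8,
    "Community": 11,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
max_total = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {max_total}")  # prints "24 / 100"
```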

Stars: 12
Forks: 2
Language: (not listed)
License: (none listed)
Last pushed: May 24, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/diffgram/awesome-training-data"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
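
The same request can be made from Python with the standard library alone. This is a minimal sketch, not an official client. The helper names `quality_url` and `fetch_quality` are hypothetical. The split of the path into category, owner, and repository is inferred from the single URL in the curl example. The page does not document the response format, so the sketch assumes the body is JSON.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL (hypothetical helper).

    Assumption: other projects follow the same
    /<category>/<owner>/<repo> pattern as the curl example.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data for one repository (hypothetical helper).

    Assumption: the response body is JSON; this is not stated on the page.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "diffgram", "awesome-training-data")` issues the same GET request as the curl command. Without a key, each call counts toward the 100 requests/day limit.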