jsbroks/awesome-dataset-tools
🔧 A curated list of awesome dataset tools
This is a curated collection of tools to help you prepare your data for machine learning and AI projects. It lists various software and services for marking up different types of raw data like images, audio, time series, and text. If you're a data scientist, machine learning engineer, or researcher, this resource helps you find the right solution to transform your raw data into structured, labeled datasets ready for model training.
937 stars. No commits in the last 6 months.
Use this if you need a tool to annotate or label data (for example, drawing bounding boxes on images, transcribing audio, or marking entities in text) to create training datasets for machine learning.
Not ideal if you are looking for tools to perform data analysis, visualization, or other data manipulation tasks beyond labeling.
Stars: 937
Forks: 129
Language: n/a
License: MIT
Category:
Last pushed: Jun 09, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/jsbroks/awesome-dataset-tools"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
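The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl command, using only the standard library. It rests on assumptions not confirmed by this page: that the endpoint returns JSON, and that other repositories follow the same `/quality/<category>/<owner>/<repo>` path pattern seen in the URL above. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the single example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository without an API key.

    Assumption: the response body is JSON. The anonymous tier is
    limited to 100 requests/day, so avoid calling this in a tight loop.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "jsbroks", "awesome-dataset-tools")` would request the same URL as the curl command. The shape of the returned data is not documented here, so inspect it before relying on any field.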
Higher-rated alternatives
jonathanwvd/awesome-industrial-datasets
A curated collection of public industrial datasets.
leomaurodesenv/game-datasets
🎮 A curated list of awesome game datasets and tools for artificial intelligence in games
NTMC-Community/awesome-neural-models-for-semantic-match
A curated list of papers dedicated to neural text (semantic) matching.
haiker2011/awesome-nlp-sentiment-analysis
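📖 A collection of NLP datasets, papers, and open-source implementations, especially for sentiment analysis, emotion cause identification, and opinion target and opinion word extraction.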
maastrichtlawtech/awesome-legal-nlp
📖 A curated list of LegalNLP resources from all around the web.