DagsHub/open-source-ml-datasets
This repository holds open-source datasets for various machine learning domains, each with a link to download and use it.
This project helps machine learning practitioners find and utilize open-source datasets for training models. It takes raw data from various online sources and makes it ready for use within the DagsHub platform, complete with metadata. Anyone working on machine learning projects, especially those needing diverse datasets for experiments or model development, would find this useful.
No commits in the last 6 months.
Use this if you are a machine learning practitioner looking for well-organized, accessible open-source datasets for your projects.
Not ideal if you are looking for custom, proprietary datasets or tools for advanced dataset versioning outside of DagsHub's ecosystem.
Stars: 9
Forks: 8
Language: not listed
License: not listed
Category:
Last pushed: Oct 27, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/DagsHub/open-source-ml-datasets"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
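For use from a script rather than the shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the `ml-frameworks` path segment as a category is inferred from the curl URL above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="ml-frameworks"):
    """Build the endpoint URL for one repository.

    Hypothetical helper: the segment order (category/owner/repo) mirrors
    the curl example; treating the first segment as a category is an assumption.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="ml-frameworks", timeout=10):
    """Fetch the record for one repository without an API key.

    Assumption: the body is JSON. If it is not, json.loads raises ValueError.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to perform the request.
    print(build_quality_url("DagsHub", "open-source-ml-datasets"))
```

Calling `fetch_quality("DagsHub", "open-source-ml-datasets")` performs the actual request. Each call counts against the keyless daily limit, so caching the result locally is worth considering for repeated runs.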
Higher-rated alternatives
open-edge-platform/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage...
explosion/ml-datasets
🌊 Machine learning dataset loaders for testing and example scripts
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with...
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
mlcommons/croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.