cleanlab/label-errors
🛠️ Corrected Test Sets for ImageNet, MNIST, CIFAR, Caltech-256, QuickDraw, IMDB, Amazon Reviews, 20News, and AudioSet
This project helps machine learning researchers and practitioners evaluate their models more accurately by identifying and correcting mislabeled examples in popular benchmark test sets such as ImageNet, MNIST, and CIFAR. It takes the original test data and labels, together with model predictions, and outputs the identified label errors and the corrected labels. It is aimed at anyone training and testing machine learning models who needs evaluation metrics they can trust.
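As a rough illustration of that workflow, the sketch below re-scores a model on a test set after swapping in corrected labels. This is only a sketch: the file names and the JSON layout used here are hypothetical placeholders, and the actual corrected-label files and their format are defined in the repository itself.

import json
import numpy as np

# Hypothetical corrected-label file; the real file names and schema live in
# the cleanlab/label-errors repository and may differ.
with open("corrected_test_labels.json") as f:
    corrected = json.load(f)  # assumed layout: {test_index (str): corrected_label (int)}

# Original test labels and your model's predicted labels (assumed .npy arrays).
y_true = np.load("original_test_labels.npy")
y_pred = np.load("model_predictions.npy")

# Overwrite only the mislabeled examples with their corrected labels.
y_corrected = y_true.copy()
for idx, label in corrected.items():
    y_corrected[int(idx)] = label

print("Accuracy vs. original labels: ", (y_pred == y_true).mean())
print("Accuracy vs. corrected labels:", (y_pred == y_corrected).mean())

Comparing the two accuracy numbers shows how much of the apparent error rate is attributable to label noise in the benchmark rather than to the model.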
Use this if you need to ensure the quality and integrity of your model's evaluation by identifying and correcting label errors in standard ML benchmark test sets.
Not ideal if you are looking for a fully pre-corrected, single-file test set for immediate download without any customization options.
Stars: 187
Forks: 11
Language: —
License: Apache-2.0
Category:
Last pushed: Dec 16, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/cleanlab/label-errors"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
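The same endpoint can be queried from a script. The sketch below uses Python's requests library against the URL from the curl example; the response schema is not documented here, so the printed field names ("stars", "forks", "license") are assumptions to be checked against the actual JSON.

import requests

# Endpoint taken from the curl example above; JSON field names are guesses.
URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/cleanlab/label-errors"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()
data = resp.json()

print(data.get("stars"), data.get("forks"), data.get("license"))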
Higher-rated alternatives
open-edge-platform/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage...
explosion/ml-datasets
🌊 Machine learning dataset loaders for testing and example scripts
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with...
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
mlcommons/croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.