autonlab/aqua
AQuA: A Benchmarking Tool for Label Quality Assessment, NeurIPS'23 D&B
AQuA helps machine learning engineers and researchers assess label quality in their datasets. Given a dataset, it benchmarks different label error detection methods and reports how well each one identifies mislabeled data, so you can choose the most effective cleaning strategy before training your models.
No commits in the last 6 months.
Use this if you need an objective comparison of methods for identifying and correcting label errors in machine learning datasets across various data types.
Not ideal if you're looking for a simple, automated 'fix-all' solution for label errors without wanting to compare different detection methods.
Stars: 23
Forks: 1
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Oct 17, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/autonlab/aqua"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
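The endpoint above follows an `owner/repo` path pattern. As a minimal sketch, the same request could be made from Python; note that the response schema is not documented here, so the code only assumes the endpoint returns JSON:

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"

def repo_quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_repo_quality(owner: str, repo: str) -> dict:
    """Fetch quality data for a repo.

    Assumes the endpoint returns a JSON body; the actual schema is
    not documented on this page, so inspect the result before use.
    """
    with urllib.request.urlopen(repo_quality_url(owner, repo)) as resp:
        return json.load(resp)

# URL for the repo on this page:
print(repo_quality_url("autonlab", "aqua"))
# https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/autonlab/aqua
```

Within the free tier (100 requests/day without a key), `fetch_repo_quality("autonlab", "aqua")` would retrieve the same data the curl command returns.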
Higher-rated alternatives
open-edge-platform/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage...
explosion/ml-datasets
🌊 Machine learning dataset loaders for testing and example scripts
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with...
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...
mlcommons/croissant
Croissant is a high-level format for machine learning datasets that brings together four rich layers.