cleanlab/cleanlab-studio
Client interface to Cleanlab Studio
This tool helps data professionals find and fix errors in their text, tabular, or image datasets. You can upload your raw data (like CSVs, JSONs, or DataFrames) to Cleanlab Studio, identify and correct mislabeled examples, and then download a refined dataset with improved labels directly into your workflow. It's designed for data scientists, machine learning engineers, and data analysts who need to ensure the quality of their training data.
No commits in the last 6 months. Available on PyPI.
Use this if you are working with a dataset that you suspect contains errors or mislabeling and you need a systematic way to identify and correct them before using the data for analysis or model training.
Not ideal if your primary goal is to build machine learning models from scratch, as this tool focuses specifically on data quality and label correction, not model development itself.
Stars: 31
Forks: 10
Language: Python
License: MIT
Category
Last pushed: Feb 18, 2025
Commits (30d): 0
Dependencies: 18
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/cleanlab/cleanlab-studio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
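The same request can be made from Python. The following is a minimal sketch, not an official client: it uses only the standard library and the endpoint shown in the curl command above. The helper names `build_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the listing does not state.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Return the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record without an API key (hypothetical helper).

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError.
    """
    request = urllib.request.Request(
        build_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("cleanlab", "cleanlab-studio")` issues the same GET request as the curl command. Keyless access is limited to 100 requests/day, so a script that polls many repositories may want to cache results.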
Higher-rated alternatives
NVIDIA-NeMo/Curator
Scalable data preprocessing and curation toolkit for LLMs
MigoXLab/dingo
Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool
data-prep-kit/data-prep-kit
Open source project for data preparation for GenAI applications
TheDataStation/pneuma
LLM-Powered Data Discovery System for Tabular Data
jpmorganchase/CodeQuest
CodeQUEST is a generalizable framework which leverages LLMs to iteratively evaluate and enhance...