scienxlab/redflag
Safety net for machine learning pipelines. Plays nice with sklearn and pandas.
Cleaning and preparing data is a routine part of building a machine learning model. This tool acts as an automated safety net for that step: given your raw data (features and targets), it flags potential issues such as imbalanced classes, unusual values, or data leakage before they degrade model performance. It is aimed at data scientists and machine learning engineers who want to build more robust models.
No commits in the last 6 months. Available on PyPI.
Use this if you want an automated system to detect common data quality issues and potential problems in your machine learning datasets before training your models.
Not ideal if you need a visual-first data exploration and profiling tool, or if you are looking for a system to monitor model performance after deployment.
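To make the kinds of checks described above concrete, here is a minimal sketch in plain Python. It does not use redflag and is not how redflag implements its checks. The function names (`imbalance_ratio`, `unusual_values`, `check_dataset`) and the thresholds are hypothetical, chosen only for illustration. It covers class imbalance and unusual values only; data leakage is not addressed.

```python
from collections import Counter
from statistics import mean, stdev


def imbalance_ratio(targets):
    """Return the majority-class count divided by the minority-class count."""
    counts = Counter(targets)
    return max(counts.values()) / min(counts.values())


def unusual_values(values, z_threshold=3.0):
    """Return values more than z_threshold sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # A constant column has no spread, so nothing can be flagged.
        return []
    return [v for v in values if abs(v - mu) / sigma > z_threshold]


def check_dataset(features, targets, max_ratio=3.0):
    """Collect human-readable warnings for a dict of feature columns and a target list.

    The thresholds here are illustrative assumptions, not recommended defaults.
    """
    warnings = []
    ratio = imbalance_ratio(targets)
    if ratio > max_ratio:
        warnings.append(f"imbalanced targets (ratio {ratio:.1f})")
    for name, column in features.items():
        outliers = unusual_values(column)
        if outliers:
            warnings.append(f"unusual values in '{name}': {outliers}")
    return warnings


# Example: 18 samples of class "a", 2 of class "b", and one extreme feature value.
example_features = {"x": [1.0] * 19 + [100.0]}
example_targets = ["a"] * 18 + ["b"] * 2
for message in check_dataset(example_features, example_targets):
    print(message)
```

A z-score test is only one simple way to define "unusual"; it assumes roughly symmetric data and needs a reasonable number of samples before any single value can exceed the threshold.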
Stars: 21
Forks: 6
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 22, 2024
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/scienxlab/redflag"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
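The same request can be made from Python using only the standard library. This is a sketch under assumptions: the path pattern `quality/<category>/<owner>/<repo>` is inferred from the single URL shown above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is returned as raw text because its format is not documented here. How an API key is supplied is also not stated, so the sketch uses the keyless tier.

```python
from urllib.request import urlopen

# Base path taken from the curl example; the pattern after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (assumed path pattern)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the endpoint and return the response body as text, unparsed."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (performs a network request, so it is left commented out):
# print(fetch_quality("ml-frameworks", "scienxlab", "redflag"))
```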
Higher-rated alternatives
skrub-data/skrub
Machine learning with dataframes
biolab/orange3
🍊 📊 💡 Orange: Interactive data analysis
root-project/root
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
cleanlab/cleanlab
Cleanlab's open-source library is the standard data-centric AI package for data quality and...
drivendataorg/deon
A command line tool to easily add an ethics checklist to your data science projects.