whylabs/whylogs
An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈
This tool helps data scientists and machine learning engineers understand and track the quality and behavior of their datasets and models over time. You pass in raw data, such as a CSV file or a pandas DataFrame, and it generates a statistical summary known as a 'whylogs profile.' You can compare these profiles over time to monitor for data changes, identify quality issues, and check that your machine learning systems remain robust and reliable.
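To make the idea of a profile concrete, here is a minimal sketch in plain Python. It is not the whylogs API and does not reproduce the library's actual metrics; the names `summarize_column`, `profile_csv`, and `SAMPLE` are hypothetical and exist only for this illustration. It shows the general shape of the workflow described above: raw tabular data goes in, and a compact per-column summary comes out.

```python
import csv
import io
from typing import Dict, List


def summarize_column(values: List[str]) -> Dict[str, object]:
    """Summarize one column: row count, missing values, distinct values, numeric range."""
    non_missing = [v for v in values if v not in (None, "")]
    numbers = []
    for v in non_missing:
        try:
            numbers.append(float(v))
        except ValueError:
            pass  # non-numeric values only contribute to the counts
    summary: Dict[str, object] = {
        "count": len(values),
        "missing": len(values) - len(non_missing),
        "distinct": len(set(non_missing)),
    }
    if numbers:
        summary["min"] = min(numbers)
        summary["max"] = max(numbers)
        summary["mean"] = sum(numbers) / len(numbers)
    return summary


def profile_csv(text: str) -> Dict[str, Dict[str, object]]:
    """Build a per-column summary (a toy 'profile') from CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    columns = list(rows[0].keys()) if rows else []
    return {col: summarize_column([row[col] for row in rows]) for col in columns}


# Hypothetical sample data: one missing value in the "age" column.
SAMPLE = "age,city\n34,Lisbon\n,Porto\n51,Lisbon\n"
profile = profile_csv(SAMPLE)
for column, stats in profile.items():
    print(column, stats)
```

Because the summary is much smaller than the raw data, summaries from different days can be stored and compared cheaply, which is what makes monitoring over time practical.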
2,801 stars. No commits in the last 6 months.
Use this if you need to continuously monitor the health, quality, and statistical properties of your data inputs or machine learning models in production, especially to detect data drift, data quality issues, or performance degradation.
Not ideal if you are looking for a tool to train machine learning models or perform complex statistical analyses beyond data profiling and anomaly detection.
Stars: 2,801
Forks: 133
Language: Jupyter Notebook
License: Apache-2.0
Category: (not shown)
Last pushed: Jan 10, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mlops/whylabs/whylogs"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
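The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repository pattern, as the single URL above suggests. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; treating what follows it as
# <category>/<owner>/<repo> is an assumption inferred from that one URL.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data, assuming the response body is JSON."""
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("mlops", "whylabs", "whylogs")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than fetching on every run.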
Higher-rated alternatives
netdata/netdata
The fastest path to AI-powered full stack observability, even for lean teams.
pixie-io/pixie
Instant Kubernetes-Native Application Observability
keikoproj/active-monitor
Provides deep monitoring and self-healing of Kubernetes clusters
sloopstash/kickstart-elk
Collect Telemetry data from a variety of platforms, workloads, and services to implement...
numaproj/numalogic-prometheus
AIOps for metrics in Prometheus