SinaDBMS/IsolationForest
Anomaly detection on mixed (categorical, numerical and text) datasets using Isolation Forest
This tool identifies unusual or problematic data points in datasets that mix numbers, categories such as product types or regions, and descriptive text. It works on the raw mixed data directly, so no complex data transformations are needed beforehand. It is aimed at anyone who needs to spot outliers quickly in diverse datasets, such as fraud analysts, quality control inspectors, or data scientists.
No commits in the last 6 months.
Use this if your dataset combines numerical, categorical (like 'color' or 'department'), and text-based information, and you need to find rare or suspicious entries without extensive pre-processing.
Not ideal if your dataset contains only numerical features: a standard Isolation Forest implementation would suffice and may be better optimized for that case.
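For the numerical-only case, a minimal sketch of what a standard implementation looks like is given here, assuming scikit-learn and NumPy are installed. It uses scikit-learn's IsolationForest, not this repository's code, and the synthetic data, the seed, and the outlier position are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic, purely numerical data (illustrative assumption): 200 inliers
# drawn around the origin, plus one obvious outlier appended as the last row.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outlier = np.array([[8.0, 8.0]])
X = np.vstack([inliers, outlier])

# Fit a standard Isolation Forest; random_state makes the run repeatable.
model = IsolationForest(n_estimators=100, random_state=0)
model.fit(X)

scores = model.score_samples(X)  # lower score means more anomalous
labels = model.predict(X)        # -1 marks an anomaly, 1 marks an inlier

# Index of the row the model considers most anomalous.
most_anomalous = int(np.argmin(scores))
print(most_anomalous, labels[most_anomalous])
```

Categorical or text columns cannot be passed to this estimator as-is, which is the gap the tool described above is meant to fill.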
Stars: 8
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Apr 18, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SinaDBMS/IsolationForest"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
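The same request can be made from Python. The sketch below uses only the standard library; the helper names build_quality_url and fetch_quality are hypothetical, and it assumes the endpoint returns JSON, which this page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository given as 'owner/name'."""
    # Keep the slash between owner and name; escape anything else unusual.
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("ml-frameworks", "SinaDBMS/IsolationForest") issues the same GET request as the curl command and counts against the same daily limit.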
Higher-rated alternatives
yzhao062/pyod
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
unit8co/darts
A Python library for user-friendly forecasting and anomaly detection on time series.
elki-project/elki
ELKI Data Mining Toolkit
raphaelvallat/antropy
AntroPy: entropy and complexity of (EEG) time-series in Python
Minqi824/ADBench
Official implementation of "ADBench: Anomaly Detection Benchmark", NeurIPS 2022.