SinaDBMS/IsolationForest

Anomaly detection on mixed (categorical, numerical and text) datasets using Isolation Forest

Score: 28 / 100 (Experimental)

This tool identifies unusual or problematic data points in datasets that mix numbers, categories such as product types or regions, and descriptive text. It processes the raw mixed data directly to pinpoint anomalies, so no complex data transformations are needed. It suits anyone who works with diverse datasets and needs to spot outliers quickly, such as fraud analysts, quality control inspectors, or data scientists.

No commits in the last 6 months.

Use this if your dataset combines numerical, categorical (like 'color' or 'department'), and text-based information, and you need to find rare or suspicious entries without extensive pre-processing.

Not ideal if your dataset contains only numerical features: a standard Isolation Forest implementation would suffice and may be better optimized for that case.
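For the numerical-only case mentioned above, the idea behind a standard Isolation Forest can be sketched in a few lines of plain Python. This is an illustrative sketch of the general algorithm, not code from this repository, and it does not handle categorical or text columns. All function names (`fit_forest`, `anomaly_score`, and so on) are hypothetical, and treating a constant feature as a leaf is a simplification.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    low = min(row[feature] for row in rows)
    high = max(row[feature] for row in rows)
    if low == high:
        # Simplification: stop when the chosen feature is constant.
        return {"size": len(rows)}
    threshold = rng.uniform(low, high)
    left = [row for row in rows if row[feature] < threshold]
    right = [row for row in rows if row[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(node, row, depth=0):
    """Depth at which row lands in a leaf, plus a correction for unsplit leaf points."""
    if "size" in node:
        return depth + average_path_length(node["size"])
    branch = "left" if row[node["feature"]] < node["threshold"] else "right"
    return path_length(node[branch], row, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=256, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(rows, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, row):
    """Score in (0, 1]; points that are isolated after few splits score higher."""
    trees, sample_size = forest
    mean_depth = sum(path_length(tree, row) for tree in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(sample_size))


# Usage: 100 points around the origin plus one distant point.
rng = random.Random(42)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
points.append((8.0, 8.0))
forest = fit_forest(points)
scores = [anomaly_score(forest, p) for p in points]
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(most_anomalous, round(scores[most_anomalous], 3))
```

The sketch only compares numeric values against thresholds, which is why purely numerical data needs nothing beyond a standard implementation. Mixed columns need either an encoding step or a tool built for them.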

Tags: fraud-detection, quality-control, data-auditing, intrusion-detection, data-analysis
Status: Stale 6m, No Package, No Dependents
Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 16 / 25
Community: 8 / 25

Stars: 8
Forks: 1
Language: Python
License: MIT
Last pushed: Apr 18, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/SinaDBMS/IsolationForest"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
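The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repository pattern, which is inferred from the single URL shown above. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Assemble the endpoint URL (path pattern inferred from the curl example)."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request and counts against the daily limit.
    print(fetch_quality("ml-frameworks", "SinaDBMS", "IsolationForest"))
```

Building the URL in a separate function lets it be checked without touching the network, which helps when staying under the unauthenticated request limit.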