linkedin/isolation-forest

A distributed Spark/Scala implementation of the isolation forest algorithm for unsupervised outlier detection, featuring support for scalable training and ONNX export for easy cross-platform inference.

58 / 100 Established

This project helps data scientists and machine learning engineers detect unusual patterns or fraudulent activity in very large datasets. Given a large numerical dataset, it flags the data points that deviate significantly from the norm. It is aimed at teams that already process their data in a distributed computing environment.
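To make "deviate significantly from the norm" concrete, here is a minimal single-machine sketch of the general isolation forest idea: points that random axis-aligned splits isolate quickly receive scores closer to 1. This is not the project's Spark/Scala code; every name below (`fit_forest`, `anomaly_score`, and so on) is hypothetical, and the defaults of 100 trees and 256 samples are assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Recursively split rows on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    feature = rng.randrange(len(rows[0]))
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    if lo == hi:
        # Simplification: stop here instead of trying another feature.
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    left = [r for r in rows if r[feature] < split]
    right = [r for r in rows if r[feature] >= split]
    return {
        "feature": feature,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point lands, plus an adjustment for unsplit leaf rows."""
    if "size" in tree:
        return depth + average_path_length(tree["size"])
    go_left = point[tree["feature"]] < tree["split"]
    child = tree["left"] if go_left else tree["right"]
    return path_length(child, point, depth + 1)


def fit_forest(rows, num_trees=100, sample_size=256, seed=0):
    """Build num_trees trees, each on a random subsample of the rows."""
    if len(rows) < 2:
        raise ValueError("need at least two rows")
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(num_trees)]
    return trees, psi


def anomaly_score(forest, point):
    """Score in (0, 1]; higher means the point is isolated in fewer splits."""
    trees, psi = forest
    mean_depth = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(psi))


# Toy data: a Gaussian cluster around the origin plus one distant point.
data_rng = random.Random(42)
inliers = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
outlier = (10.0, 10.0)

forest = fit_forest(inliers + [outlier])
outlier_score = anomaly_score(forest, outlier)
inlier_score = anomaly_score(forest, (0.0, 0.0))
print("outlier:", round(outlier_score, 3), "inlier:", round(inlier_score, 3))
```

The distant point should score noticeably higher than a point at the center of the cluster. A distributed implementation would parallelize tree building and scoring across a cluster, which this sketch does not attempt.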

253 stars.

Use this if you need to find anomalies or outliers in extremely large datasets using distributed computing frameworks like Apache Spark.

Not ideal if you are working with smaller datasets or prefer solutions that don't require a distributed processing setup.

fraud-detection anomaly-detection data-science large-scale-analytics machine-learning-engineering
No Package, No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25


Stars 253
Forks 54
Language Scala
License
Last pushed Mar 12, 2026
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/linkedin/isolation-forest"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
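The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern as in the single example URL above. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Assemble the endpoint URL (path pattern inferred from one example)."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    return json.loads(body)
```

Calling `fetch_quality("ml-frameworks", "linkedin", "isolation-forest")` would issue the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.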