src-d/kmcuda

Large scale K-means and K-nn implementation on NVIDIA GPU / CUDA

Score: 50 / 100 (Established)

This tool helps scientists, marketers, and other data practitioners group very large datasets into clusters and find the closest data points. You provide large tables of numerical data; it returns a cluster assignment for each data point and its nearest neighbors. It targets workloads where fast clustering and nearest-neighbor search matter.
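To make the input and output shapes concrete, here is a minimal CPU sketch in NumPy of what "cluster assignments" and "nearest neighbors" mean for a table of numbers. It is not kmcuda's API or its GPU implementation: the function names `kmeans_lloyd` and `nearest_neighbors` are hypothetical, the algorithm is plain Lloyd's K-means, and the neighbor search is brute force.

```python
import numpy as np


def assign_to_centroids(samples, centroids):
    """Return the index of the closest centroid for every sample."""
    # Squared Euclidean distance from every sample to every centroid.
    dists = ((samples[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans_lloyd(samples, k, iterations=20, seed=0):
    """Plain Lloyd's K-means on the CPU; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    # Use k distinct samples as the initial centroids.
    centroids = samples[rng.choice(len(samples), size=k, replace=False)].copy()
    for _ in range(iterations):
        assignments = assign_to_centroids(samples, centroids)
        for j in range(k):
            members = samples[assignments == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    # Recompute so the assignments match the final centroids.
    return centroids, assign_to_centroids(samples, centroids)


def nearest_neighbors(samples, query_index, n):
    """Brute-force indices of the n samples closest to one sample."""
    dists = ((samples - samples[query_index]) ** 2).sum(axis=1)
    dists[query_index] = np.inf  # exclude the query point itself
    return np.argsort(dists)[:n]
```

Each row of `samples` is one data point and each column one numerical feature. The sketch materializes a full sample-by-centroid distance matrix, which is exactly the cost that becomes prohibitive on very large datasets and that a GPU implementation is meant to address.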

841 stars. No commits in the last 6 months.

Use this if you need to perform K-means clustering or K-nearest neighbors search on massive datasets and have access to NVIDIA GPUs for significantly faster processing.

Not ideal if you do not have NVIDIA GPUs, since its core performance advantage relies on CUDA acceleration, or if your data contains many missing (NaN) values and you want to use the faster 'Yinyang' algorithm.
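A quick way to judge the NaN caveat before choosing an algorithm is to measure how many rows contain missing values. The helper below is a hypothetical NumPy sketch, not part of kmcuda; what counts as "many" is not defined here, so any threshold you apply to the result is your own assumption.

```python
import numpy as np


def nan_row_fraction(samples):
    """Fraction of rows that contain at least one NaN feature."""
    return float(np.isnan(samples).any(axis=1).mean())


def drop_nan_rows(samples):
    """Return only the rows without any NaN feature."""
    return samples[~np.isnan(samples).any(axis=1)]
```

Dropping rows is only one option; it changes the dataset, so the returned cluster assignments no longer line up with the original row indices.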

data-mining customer-segmentation image-recognition bioinformatics pattern-recognition
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 24 / 25


Stars: 841
Forks: 146
Language: Jupyter Notebook
License: (not listed)
Last pushed: Oct 11, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/src-d/kmcuda"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.