gagolews/clustering-data-v1

A framework for benchmarking clustering algorithms – Benchmark suite, version 1


Emerging

This project provides a collection of standardized datasets for evaluating how well clustering algorithms perform. Each dataset pairs raw numerical data with reference ('true') cluster assignments, so researchers can rigorously compare the accuracy and efficiency of different clustering methods. Data scientists and machine learning researchers use it to test and improve their clustering techniques.

No commits in the last 6 months.

Use this if you need reliable, diverse datasets with known ground-truth cluster labels to benchmark or develop new clustering algorithms.

Not ideal if you are looking for code to implement clustering algorithms or an automated tool for running benchmarks; this provides only the datasets.
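
Because only the datasets are provided, scoring a clustering against the reference labels is left to the user. The following is a minimal sketch of one common way to do that, the adjusted Rand index, written with the Python standard library only. The function name `adjusted_rand_index` and the toy label vectors are illustrative assumptions; they are not part of this project, and the project may recommend other measures.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same points.

    Hypothetical helper for illustration; not part of the benchmark suite.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # nothing to compare with fewer than two points

    # Contingency counts: points per (reference, predicted) cluster pair.
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of point pairs placed together in each grouping.
    sum_pairs = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    total = comb(n, 2)

    expected = sum_rows * sum_cols / total  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2     # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_pairs - expected) / (maximum - expected)


# Toy label vectors (made up, not taken from the suite).
reference = [1, 1, 1, 2, 2, 2]
same_partition = [2, 2, 2, 1, 1, 1]   # same grouping, cluster IDs swapped
other_partition = [1, 1, 2, 2, 3, 3]  # splits the points differently

print(adjusted_rand_index(reference, same_partition))
print(adjusted_rand_index(reference, other_partition))
```

The index depends only on which points are grouped together, so a prediction that swaps cluster IDs but keeps the same grouping still scores 1.0, while a partly mismatched grouping scores lower.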

data-science-research machine-learning-evaluation clustering-benchmarking algorithm-testing

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 5 / 25

Maturity 16 / 25

Community 8 / 25


Language: Jupyter Notebook

License: not listed

Higher-rated alternatives

scikit-learn-contrib/hdbscan

A high performance implementation of HDBSCAN clustering.

annoviko/pyclustering

pyclustering is a Python, C++ data mining library.

panagiotisanagnostou/HiPart

Hierarchical divisive clustering: algorithm execution, visualization, and interactive visualization.

erdogant/clusteval

Clusteval provides methods for unsupervised cluster validation.

mqcomplab/MDANCE

MDANCE: O(N) clustering for molecular dynamics. Process 1.5M frames in 40min. 8 specialized algorithms.
