tomtung/omikuji

An efficient implementation of Partitioned Label Trees and their variations for extreme multi-label classification

Score: 41 / 100 (Emerging)

This project helps data scientists and machine learning engineers efficiently classify documents, images, or other data points into a very large number of categories. You provide a dataset of inputs with their associated labels, and it produces a trained model that quickly and accurately assigns the relevant labels to new, unseen inputs. It is designed for practitioners whose label space is very large and whose items can each carry many labels.

No commits in the last 6 months.

Use this if you need to perform multi-label classification on massive datasets with a huge number of potential labels and want faster training times without sacrificing accuracy.

Not ideal if you are working with small datasets or a limited number of categories, as its specialized efficiency for 'extreme' scenarios won't provide significant benefits.

Tags: multi-label classification, large-scale data, text categorization, recommendation systems, information retrieval
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 12 / 25
Maturity 16 / 25
Community 13 / 25


Stars: 91
Forks: 11
Language: Rust
License: MIT
Last pushed: Feb 20, 2024
Monthly downloads: 20
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/tomtung/omikuji"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
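
The same request can be made from a script. The following is a minimal Python sketch, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the URL layout (category, owner, repository) is inferred from the curl example above, and the response is assumed to be JSON, since its schema is not described on this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else here is illustrative.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "tomtung", "omikuji")
# print(json.dumps(data, indent=2))
```

Unauthenticated access is limited to 100 requests per day, so a script that polls many repositories should cache responses rather than refetch them.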