Infini-AI-Lab/vortex_torch

Vortex: A Flexible and Efficient Sparse Attention Framework

Quality score: 38 / 100 (Emerging)

Vortex helps AI researchers and engineers develop and deploy custom sparse attention algorithms for large language models (LLMs). Given a specification of a new sparse attention pattern, it outputs highly optimized code that runs efficiently on modern inference systems. It is aimed at practitioners advancing LLM efficiency through novel attention mechanisms.

Use this if you need to rapidly prototype, extend, or deploy custom sparse attention algorithms for LLM inference without dealing with low-level optimizations.
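To make "sparse attention pattern" concrete, here is a toy sliding-window (causal local) pattern, one of the simplest patterns a framework like Vortex might compile into fast kernels. This is an illustrative sketch only and does not use Vortex's actual API; the function name and parameters are hypothetical.

```python
# Illustrative sketch of a sparse attention pattern (NOT Vortex's API).
# A sliding-window causal mask: query i may attend only to the `window`
# most recent keys, including itself, so attention cost drops from
# O(n^2) to O(n * window).
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask where entry (i, j) is True iff query i may see key j."""
    idx = np.arange(seq_len)
    diff = idx[:, None] - idx[None, :]  # i - j for every (query, key) pair
    # Causal (j <= i) and local (within `window` positions back).
    return (diff >= 0) & (diff < window)

mask = sliding_window_mask(seq_len=6, window=3)
```

Each row of `mask` has at most `window` nonzero entries; a framework in this space would turn such a pattern into a fused kernel rather than materializing the dense mask.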

Not ideal if you are an end-user of LLMs and not involved in their underlying architectural research or engineering.

LLM-research AI-efficiency model-optimization deep-learning-engineering
No package · No dependents

Maintenance: 10 / 25
Adoption: 8 / 25
Maturity: 13 / 25
Community: 7 / 25


Stars: 49
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Jan 21, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Infini-AI-Lab/vortex_torch"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
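The same endpoint can be queried from Python. The URL is the one shown above; the shape of the JSON response is not documented here, so this sketch prints whatever comes back rather than assuming a schema, and guards against network failure.

```python
# Sketch: fetching the quality data from Python instead of curl.
# The endpoint URL is taken from the page above; the response fields
# are not documented here, so we just pretty-print the raw JSON.
import json
import urllib.error
import urllib.request

url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "transformers/Infini-AI-Lab/vortex_torch")
try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    print(json.dumps(data, indent=2))
except (urllib.error.URLError, TimeoutError) as exc:
    # Network access may be unavailable; the request shape is the point.
    print(f"request failed: {exc}")
```

With an API key (for the 1,000/day tier), the key would typically be sent as a header or query parameter; check the service's documentation for the exact mechanism, since it is not specified on this page.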