NX-AI/mlstm_kernels
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
This library provides highly optimized implementations of mLSTM (matrix Long Short-Term Memory) kernels, essential for training and running advanced AI models. It takes standard tensor inputs (queries, keys, values, gate pre-activations, and recurrent states) and produces the outputs and updated states for the next step, greatly speeding up the core computations of mLSTM-based neural networks. It is used by AI researchers and machine learning engineers developing or deploying large language models and other sequence-based AI systems.
Use this if you are a machine learning engineer or researcher working with JAX or PyTorch and need to accelerate the training and inference of mLSTM-based models, especially for long sequences.
Not ideal if you want a high-level API for general data analysis or traditional machine learning tasks rather than low-level kernels for advanced sequence models.
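To make the computation concrete, here is a minimal pure-NumPy sketch of a single mLSTM recurrent step, following the stabilized matrix-memory formulation from the xLSTM paper. This is an illustrative reference only, not the library's fused Triton/CUDA kernels, and the function name and signature are hypothetical, not part of the `mlstm_kernels` API.

```python
import numpy as np

def mlstm_step(C, n, m, q, k, v, i_pre, f_pre):
    """One recurrent step of an mLSTM (matrix-memory LSTM) cell.

    C: (d, d) matrix memory, n: (d,) normalizer, m: scalar stabilizer.
    q, k, v: (d,) query/key/value vectors.
    i_pre, f_pre: scalar input/forget gate pre-activations.
    Hypothetical reference implementation, not the library's kernels.
    """
    d = k.shape[0]
    # Log-space stabilizer keeps the exponential gates from overflowing.
    m_new = max(f_pre + m, i_pre)
    f = np.exp(f_pre + m - m_new)   # stabilized forget gate
    i = np.exp(i_pre - m_new)       # stabilized input gate
    k_scaled = k / np.sqrt(d)
    # Matrix memory and normalizer updates.
    C_new = f * C + i * np.outer(v, k_scaled)
    n_new = f * n + i * k_scaled
    # Read out with the query, normalized for stability.
    h_tilde = C_new @ q
    h = h_tilde / max(abs(n_new @ q), 1.0)
    return C_new, n_new, m_new, h
```

The library's value is in fusing loops of exactly this kind of step into tiled, chunk-parallel GPU kernels; this sketch only shows what one step computes.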
Stars
87
Forks
6
Language
Jupyter Notebook
License
—
Category
Last pushed
Mar 01, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NX-AI/mlstm_kernels"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
thu-ml/SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x...
thu-ml/SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
fla-org/flame
🔥 A minimal training framework for scaling FLA models
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for...