NX-AI/mlstm_kernels
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.
This library provides highly optimized implementations of mLSTM (matrix Long Short-Term Memory) kernels, essential for training and running advanced AI models. It takes standard tensor inputs (queries, keys, values, gate pre-activations, and recurrent states) and produces the outputs and updated states for the next step, greatly speeding up the core computations of mLSTM-based neural networks. It is used by AI researchers and machine learning engineers developing or deploying large language models and other sequence-based AI systems.
Use this if you are a machine learning engineer or researcher working with JAX or PyTorch and need to accelerate the training and inference of mLSTM-based models, especially for long sequences.
Not ideal if you want a high-level API for general data analysis or traditional machine learning tasks rather than low-level kernels for advanced sequence models.
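To make the computation concrete, here is a minimal pure-NumPy sketch of a single mLSTM recurrent step, following the stabilized matrix-memory formulation from the xLSTM paper. This is an illustrative reference only, not the library's fused Triton/CUDA kernels, and the function name and signature are hypothetical, not part of the `mlstm_kernels` API.

```python
import numpy as np

def mlstm_step(C, n, m, q, k, v, i_pre, f_pre):
    """One recurrent step of an mLSTM (matrix-memory LSTM) cell.

    C: (d, d) matrix memory, n: (d,) normalizer, m: scalar stabilizer.
    q, k, v: (d,) query/key/value vectors.
    i_pre, f_pre: scalar input/forget gate pre-activations.
    Hypothetical reference implementation, not the library's kernels.
    """
    d = k.shape[0]
    # Log-space stabilizer keeps the exponential gates from overflowing.
    m_new = max(f_pre + m, i_pre)
    f = np.exp(f_pre + m - m_new)   # stabilized forget gate
    i = np.exp(i_pre - m_new)       # stabilized input gate
    k_scaled = k / np.sqrt(d)
    # Matrix memory and normalizer updates.
    C_new = f * C + i * np.outer(v, k_scaled)
    n_new = f * n + i * k_scaled
    # Read out with the query, normalized for stability.
    h_tilde = C_new @ q
    h = h_tilde / max(abs(n_new @ q), 1.0)
    return C_new, n_new, m_new, h
```

The library's value is in fusing loops of exactly this kind of step into tiled, chunk-parallel GPU kernels; this sketch only shows what one step computes.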
Stars
87
Forks
6
Language
Jupyter Notebook
License
—
Category
Last pushed
Mar 01, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/NX-AI/mlstm_kernels"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
thu-ml/SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x...
thu-ml/SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
fla-org/flame
🔥 A minimal training framework for scaling FLA models
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for...