flash-linear-attention and Flash-Sparse-Attention
Linear attention and sparse attention are complementary techniques for reducing the quadratic cost of transformer attention. Linear attention replaces softmax attention with kernelized formulations that can be computed in O(n) time and admit a recurrent, state-space-style form, while sparse attention keeps exact softmax attention but computes it only between selected token pairs. These implementations therefore target different efficiency trade-offs and suit different use cases, rather than serving as direct alternatives to one another.
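To make the contrast concrete, here is a minimal PyTorch sketch (illustrative only, not either library's API or kernels): linear attention reassociates the attention product so the n × n score matrix is never formed, while sparse attention keeps exact softmax scores but evaluates them only on a selected subset of pairs (a sliding window in this toy example).

```python
import torch
import torch.nn.functional as F

n, d = 1024, 64
q, k, v = (torch.randn(n, d) for _ in range(3))

# Full softmax attention: materializes an n x n score matrix -> O(n^2 * d).
full_out = F.softmax(q @ k.T / d**0.5, dim=-1) @ v

# Linear attention: with a positive feature map phi (elu(x)+1, as in
# Katharopoulos et al., 2020), (phi(q) phi(k)^T) v reassociates into
# phi(q) (phi(k)^T v), where phi(k)^T v is a d x d state -> O(n * d^2).
phi = lambda x: F.elu(x) + 1.0
state = phi(k).T @ v                   # (d, d) summary of all keys/values
norm = phi(q) @ phi(k).sum(0)          # per-query normalizer, shape (n,)
linear_out = (phi(q) @ state) / norm[:, None]

# Sparse attention: exact softmax, but only between selected token pairs
# (here a sliding window of half-width w); unselected scores are masked out.
w = 128
idx = torch.arange(n)
mask = (idx[:, None] - idx[None, :]).abs() <= w
scores = (q @ k.T / d**0.5).masked_fill(~mask, float("-inf"))
sparse_out = F.softmax(scores, dim=-1) @ v
```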
About flash-linear-attention
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
This project offers highly optimized building blocks for developing next-generation AI models that can process very long sequences efficiently. It provides ready-to-use implementations of state-of-the-art linear attention and state-space model architectures. AI researchers and machine learning engineers can use these components to build more powerful, scalable models for tasks such as natural language understanding and time-series prediction.
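The link to state-space models is that, for causal sequences, linear attention can be rewritten as a recurrence over a fixed-size matrix state; optimized kernels like those in this repo typically process that recurrence in parallel chunks, but a token-by-token sketch shows the idea. This is plain PyTorch with illustrative names, not the library's actual API:

```python
import torch

def linear_attention_recurrent(q, k, v):
    """Causal linear attention as an O(n) recurrence over a fixed (d, d)
    state. Production kernels process chunks in parallel rather than one
    token at a time, but the state update is the same:
    S_t = S_{t-1} + k_t v_t^T."""
    n, d = q.shape
    state = torch.zeros(d, d)      # running sum of outer products k_t v_t^T
    z = torch.zeros(d)             # running sum of keys, for normalization
    out = torch.empty_like(v)
    for t in range(n):
        state = state + torch.outer(k[t], v[t])
        z = z + k[t]
        out[t] = (q[t] @ state) / (q[t] @ z).clamp_min(1e-6)
    return out

# Positive features (exp) keep the normalizer well defined in this toy setup.
q, k, v = (torch.randn(256, 64).exp() for _ in range(3))
print(linear_attention_recurrent(q, k, v).shape)  # torch.Size([256, 64])
```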
About Flash-Sparse-Attention
Relaxed-System-Lab/Flash-Sparse-Attention
🚀🚀 Efficient implementations of Native Sparse Attention
This project offers an optimized way to train and run large language models (LLMs) more efficiently. It consumes standard LLM inputs but processes them with a more performant sparse attention mechanism, yielding faster computation and lower memory use. Developers and AI engineers working on LLM training and deployment, particularly with models that rely on sparse attention, will find it useful.
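As context, Native Sparse Attention (Yuan et al., 2025) combines compressed, selected, and sliding-window attention branches. The toy sketch below illustrates only the selection idea, where each query attends exactly to its highest-scoring key/value blocks. It is a plain PyTorch approximation, not this repo's fused kernels, and the block-scoring heuristic (mean key per block) is an assumption for illustration:

```python
import torch
import torch.nn.functional as F

def topk_block_attention(q, k, v, block=64, top_k=4):
    """Toy blockwise-selected attention: score key blocks by their mean key,
    keep the top_k blocks per query, then run exact softmax attention over
    only the selected tokens. Non-causal, single-head, for illustration."""
    n, d = k.shape
    kb = k.view(n // block, block, d).mean(1)       # one summary key per block
    block_scores = q @ kb.T                         # (n, num_blocks)
    sel = block_scores.topk(top_k, dim=-1).indices  # (n, top_k) chosen blocks
    # Expand block indices to token indices: (n, top_k * block).
    tok = (sel[..., None] * block + torch.arange(block)).flatten(1)
    k_sel, v_sel = k[tok], v[tok]                   # (n, top_k*block, d)
    attn = F.softmax((k_sel @ q[:, :, None]).squeeze(-1) / d**0.5, dim=-1)
    return (attn[:, None, :] @ v_sel).squeeze(1)    # (n, d)

q, k, v = (torch.randn(512, 64) for _ in range(3))
print(topk_block_attention(q, k, v).shape)          # torch.Size([512, 64])
```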