flash-linear-attention and ring-sliding-window-attention

Metric          flash-linear-attention    ring-sliding-window-attention
Maintenance     20/25                     2/25
Adoption        11/25                     5/25
Maturity        25/25                     15/25
Community       20/25                     0/25
Stars           4,549                     9
Forks           431                       —
Downloads       —                         —
Commits (30d)   29                        0
Language        Python                    Python
License         MIT                       MIT
Risk flags      None                      Stale 6m, No Package, No Dependents

About flash-linear-attention

fla-org/flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models

This project offers highly optimized building blocks for next-generation AI models that process very long sequences efficiently. It provides ready-to-use implementations of state-of-the-art linear attention and state-space model architectures, which AI researchers and machine learning engineers can use to build more powerful and scalable models for tasks like natural language understanding or time-series prediction.

AI-model-development large-language-models sequence-modeling deep-learning-optimization AI-research
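
To make the idea concrete, here is a minimal NumPy sketch of causal linear attention, the core technique the library implements in optimized form. This is an illustration of the concept only, not the library's API: the `elu(x) + 1` feature map and the small epsilon in the normalizer are common choices assumed here, not details taken from the project.

```python
import numpy as np

def linear_attention(q, k, v):
    """Causal linear attention with an elu(x)+1 feature map.

    Softmax attention costs O(T^2) in sequence length T; replacing
    softmax(q_t . k_s) with phi(q_t) . phi(k_s) lets the key/value
    summary be accumulated as a running sum, giving O(T) time.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, always > 0
    qf, kf = phi(q), phi(k)
    T, d = qf.shape
    out = np.zeros_like(v)
    s = np.zeros((d, v.shape[1]))   # running sum of outer(k_t, v_t)
    z = np.zeros(d)                 # running sum of k_t (normalizer state)
    for t in range(T):
        s += np.outer(kf[t], v[t])
        z += kf[t]
        out[t] = qf[t] @ s / (qf[t] @ z + 1e-6)
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 4)) for _ in range(3))
y = linear_attention(q, k, v)
print(y.shape)  # (8, 4)
```

The loop form shows why this scales: the per-step state (`s`, `z`) has fixed size, so memory does not grow with context length. The library's kernels compute the same recurrence in chunked, hardware-efficient form.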

About ring-sliding-window-attention

XunhaoLai/ring-sliding-window-attention

Ring sliding window attention implementation with flash attention

This is a specialized tool for machine learning engineers training large language models on very long text sequences. It distributes the sliding-window attention computation across multiple GPUs arranged in a ring: you pass in the model's query, key, and value tensors, and it returns the attention output, enabling faster training for long contexts.

large-language-models distributed-training deep-learning natural-language-processing GPU-acceleration
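
The single-device pattern that this repo distributes can be sketched in plain NumPy. The code below is a conceptual illustration of causal sliding-window attention, not the repo's API: the `window` parameter and mask construction are assumptions for the sketch, and the actual project fuses this with flash attention and shards it across GPUs.

```python
import numpy as np

def sliding_window_attention(q, k, v, window=4):
    """Causal sliding-window attention: position t attends only to
    keys at positions [t - window + 1, t]. Single-device NumPy sketch."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(T)
    # keep only causal positions within the window; mask the rest out
    mask = (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)
    scores = np.where(mask, scores, -np.inf)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(10, 4)) for _ in range(3))
y = sliding_window_attention(q, k, v, window=3)
print(y.shape)  # (10, 4)
```

Because each position only needs a bounded slice of keys and values, the key/value tensors can be rotated around a ring of GPUs so every device eventually sees the slice its queries need, which is the distribution strategy the repo's name describes.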

Scores updated daily from GitHub, PyPI, and npm data.