Dao-AILab/flash-attention
Fast and memory-efficient exact attention
23,131 stars. Used by 18 other packages. Actively maintained with 54 commits in the last 30 days. Available on PyPI.
Stars: 23,131
Forks: 2,583
Language: Python
License: BSD-3-Clause
Last pushed: Apr 04, 2026
Commits (30d): 54
Dependencies: 2
Reverse dependents: 18
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/Dao-AILab/flash-attention"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
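A minimal Python sketch of calling the same endpoint, assuming the URL from the curl example above returns JSON; the Authorization header name used for an API key is hypothetical and should be checked against the API docs.

import requests

API_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/Dao-AILab/flash-attention"

def fetch_quality(api_key=None):
    """Fetch repo quality data; the key is optional (it only raises the rate limit)."""
    headers = {}
    if api_key:
        # Hypothetical header; confirm the real auth scheme with the API docs.
        headers["Authorization"] = f"Bearer {api_key}"
    resp = requests.get(API_URL, headers=headers, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(fetch_quality())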
Related models
wuwangzhang1216/abliterix
Fully automatic censorship removal for language models. LoRA abliteration + Optuna TPE optimization.
lucidrains/deep-cross-attention
Implementation of DeepCrossAttention, as proposed by Heddes et al. at Google Research, in PyTorch
modelscope/mcore-bridge
MCore-Bridge: Providing Megatron-Core model definitions for state-of-the-art large models and...
assembly-automation-hub/repo-governance
⚙️ Reusable GitHub repository governance kit: CI/CD workflows, CodeQL SAST, Dependabot...
zhongkaifu/TensorSharp
A C# inference engine for running large language models (LLMs) locally using GGUF model files....