lengstrom/flashback
A FlashAttention backwards-over-backwards pass
This project helps machine learning engineers and researchers accelerate advanced training techniques for attention-based models such as Transformers. It provides optimized components for the 'backwards-over-backwards' pass, a specialized second-order gradient computation: given the attention inputs (query, key, and value tensors), it produces memory-efficient second-order gradients, speeding up research in areas such as meta-learning and hyperparameter optimization.
No commits in the last 6 months.
Use this if you are a machine learning researcher or engineer experimenting with meta-learning, hyperparameter optimization, or architecture search, and need to compute higher-order gradients for attention-based models more efficiently.
Not ideal if you only need standard first-order gradients or are working with models that don't extensively use attention mechanisms, as the specialized optimizations may not provide significant benefits.
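To make "backwards-over-backwards" concrete, here is a minimal sketch using plain PyTorch autograd rather than this repository's kernels; the tensor shapes and the scalar objectives are illustrative, not from the project:

```python
import torch

# Second-order gradients through standard scaled dot-product attention.
# flashback's goal is to do this kind of pass with FlashAttention-style
# memory efficiency; here we use ordinary autograd to show the idea.
torch.manual_seed(0)
q = torch.randn(2, 4, 16, 64, requires_grad=True)  # (batch, heads, seq, head_dim)
k = torch.randn(2, 4, 16, 64, requires_grad=True)
v = torch.randn(2, 4, 16, 64, requires_grad=True)

scale = q.shape[-1] ** -0.5
attn = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
out = attn @ v
loss = out.sum()

# First backward pass: create_graph=True records the gradient computation
# itself, so the gradient can be differentiated again.
(dq,) = torch.autograd.grad(loss, q, create_graph=True)

# Backward over the backward: differentiate a scalar function of the
# first-order gradient with respect to the inputs (a second-order gradient).
(ddq,) = torch.autograd.grad(dq.pow(2).sum(), q)
print(ddq.shape)  # torch.Size([2, 4, 16, 64])
```

This double-backward pattern is what meta-learning and hyperparameter-optimization methods rely on; the cost of doing it naively through attention is what motivates a specialized kernel.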
Stars: 10
Forks: 1
Language: Jupyter Notebook
License: —
Category: —
Last pushed: Mar 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lengstrom/flashback"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
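The same data can be pulled from a script; a minimal sketch using the Python requests library (the assumption that the endpoint returns JSON is ours; check the API docs for the actual schema):

```python
import requests

# Fetch the listing data shown above for lengstrom/flashback.
url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lengstrom/flashback"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # assumed JSON payload; no API key needed up to 100 requests/day
```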
Higher-rated alternatives
triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
gpu-mode/Triton-Puzzles
Puzzles for learning Triton
hailo-ai/hailo_model_zoo
The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment
open-mmlab/mmdeploy
OpenMMLab Model Deployment Framework
hyperai/tvm-cn
TVM Documentation in Simplified Chinese / TVM 中文文档