lengstrom/flashback

A FlashAttention backwards-over-backwards βš‘πŸ”™πŸ”™

Score: 20 / 100 (Experimental)

This project helps machine learning engineers and researchers accelerate advanced training techniques for attention-based models such as Transformers. It provides optimized components for the 'backwards-over-backwards' pass: differentiating through the attention backward pass to obtain second-order gradients (gradients of gradients). Input consists of the attention tensors (Query, Key, Value), and the output is memory-efficient second-order gradients, enabling faster research in areas like meta-learning and hyperparameter optimization.
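For readers new to the term, here is a minimal sketch of what a backwards-over-backwards pass computes, written with plain PyTorch autograd. It is a reference illustration only: flashback's own kernels and API are not shown, and the shapes and toy loss are arbitrary.

# Reference sketch of a "backwards-over-backwards" pass using plain
# PyTorch autograd. Shows WHAT is computed, not this repo's optimized
# kernels; all shapes and the toy loss are illustrative.
import torch

torch.manual_seed(0)
B, H, S, D = 2, 4, 16, 8  # batch, heads, sequence length, head dim
q = torch.randn(B, H, S, D, requires_grad=True)
k = torch.randn(B, H, S, D, requires_grad=True)
v = torch.randn(B, H, S, D, requires_grad=True)

# Standard scaled dot-product attention.
scores = q @ k.transpose(-2, -1) / D**0.5
out = torch.softmax(scores, dim=-1) @ v

# First backward: gradients of a scalar loss w.r.t. Q, K, V,
# keeping the graph so we can differentiate again.
loss = out.sum()
dq, dk, dv = torch.autograd.grad(loss, (q, k, v), create_graph=True)

# Backward-over-backward: differentiate a function of the first-order
# gradients (here their squared norm, as in gradient penalties or
# meta-learning objectives) w.r.t. the inputs once more.
grad_norm = dq.pow(2).sum() + dk.pow(2).sum() + dv.pow(2).sum()
ddq, ddk, ddv = torch.autograd.grad(grad_norm, (q, k, v))
print(ddq.shape)  # torch.Size([2, 4, 16, 8])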

No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer experimenting with meta-learning, hyperparameter optimization, or architecture search, and need to compute higher-order gradients for attention-based models more efficiently.
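As a concrete instance of the hyperparameter-optimization use case, the sketch below differentiates a validation loss through one inner SGD step to get a hypergradient for the learning rate. It again uses plain PyTorch autograd with illustrative names and toy losses; flashback would accelerate the attention portion of such second-order passes.

# Illustrative hypergradient: d(validation loss)/d(learning rate)
# through one differentiable SGD step. Toy linear model; not this
# repo's API.
import torch

torch.manual_seed(0)
w = torch.randn(8, 8, requires_grad=True)    # model parameter
lr = torch.tensor(0.1, requires_grad=True)   # hyperparameter
x_train, x_val = torch.randn(4, 8), torch.randn(4, 8)

train_loss = (x_train @ w).pow(2).mean()
(g,) = torch.autograd.grad(train_loss, w, create_graph=True)

w_new = w - lr * g                           # one differentiable SGD step
val_loss = (x_val @ w_new).pow(2).mean()

# The hypergradient flows back through g, i.e. through a second-order term.
(hypergrad,) = torch.autograd.grad(val_loss, lr)
print(hypergrad)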

Not ideal if you only need standard first-order gradients or are working with models that don't extensively use attention mechanisms, as the specialized optimizations may not provide significant benefits.

deep-learning-research attention-mechanisms meta-learning hyperparameter-optimization gradient-descent-optimization
No License · Stale (6m) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 7 / 25


Stars: 10
Forks: 1
Language: Jupyter Notebook
License: None
Last pushed: Mar 28, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lengstrom/flashback"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
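The same endpoint can also be queried from Python using only the standard library. The response schema is not documented on this page, so this sketch just pretty-prints whatever JSON comes back.

# Fetch the quality report for this repo; schema unknown, so print raw JSON.
import json
import urllib.request

url = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/lengstrom/flashback"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)
print(json.dumps(data, indent=2))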