fla-org/flame
🔥 A minimal training framework for scaling FLA models
This project provides a training framework for creating highly efficient large language models, specifically those using Flash Linear Attention (FLA). It takes raw text datasets, like the FineWeb-Edu corpus, and outputs a trained language model ready for use in various applications. It's designed for machine learning researchers and engineers focused on developing custom, performant language models.
Use this if you are building and training your own large language models with a focus on high efficiency and scalability, especially when working with massive text datasets.
Not ideal if you're looking to simply fine-tune existing, pre-trained models or if you don't need to train models from scratch on large-scale datasets.
Stars: 355
Forks: 58
Language: Python
License: MIT
Category:
Last pushed: Nov 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/fla-org/flame"
Open to everyone: 100 requests/day with no API key. A free key raises the limit to 1,000 requests/day.
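For programmatic use, the curl call above can be sketched in Python with the standard library. This is a minimal sketch: the response schema is not documented here, and the `Authorization: Bearer` header name for the optional API key is an assumption, so check the API's own docs before relying on it.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE_URL}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str, api_key: str = "") -> dict:
    """Fetch quality data for a repo; a key (header name assumed) raises the rate limit."""
    req = urllib.request.Request(quality_url(owner, repo))
    if api_key:
        # Hypothetical header; the service may expect a different scheme.
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `fetch_quality("fla-org", "flame")` would request the same URL shown in the curl command.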
Related models
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
thu-ml/SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x...
thu-ml/SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for...
NX-AI/mlstm_kernels
Tiled Flash Linear Attention library for fast and efficient mLSTM Kernels.