BICLab/MetaLA
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)
This project offers an improved method for building large language models (LLMs) and other AI models that process sequential data. It takes raw text or other data sequences and produces a more efficient and effective AI model ready for tasks like language generation or classification. It is designed for AI researchers and machine learning engineers developing next-generation foundation models.
No commits in the last 6 months.
Use this if you are a researcher or engineer working on large AI models and need a more efficient and performant 'attention' mechanism than current linear models, especially for long sequences.
Not ideal if you are looking for an off-the-shelf AI model or a simple tool for immediate application, as this is a foundational component for building such models.
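For context on why linear attention helps with long sequences: MetaLA's exact formulation is in the paper, but the generic idea behind linear attention is to replace softmax(QKᵀ)V, which costs O(n²) in sequence length n, with φ(Q)(φ(K)ᵀV), which costs O(n·d²). A minimal NumPy sketch of that generic trick (not MetaLA's method; the feature map φ here is an illustrative placeholder):

```python
# Background sketch only -- generic linear attention, NOT MetaLA's exact
# formulation (see the paper for that). Shows why cost drops from
# O(n^2) to O(n d^2): matrix associativity lets phi(K)^T V be computed
# once, independent of sequence length n.
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: O(n^2) time and memory in sequence length."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Linear attention: compute phi(K)^T V first (a d x d_v matrix),
    so total cost is linear in sequence length n."""
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # (d, d_v), independent of n
    z = Qp @ Kp.sum(axis=0)       # per-query normalizer
    return (Qp @ kv) / z[:, None]

# Tiny demo: both variants map (n, d) inputs to (n, d_v) outputs.
n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
assert out.shape == (n, d)
```

The payoff is the `kv` term: it is computed once per sequence and can be updated recurrently token by token, which is what makes linear-attention models attractive for long-context and streaming generation.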
Stars
35
Forks
2
Language
Python
License
—
Category
Last pushed
Jan 18, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/BICLab/MetaLA"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
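The same request can be made from Python with the standard library. The URL comes from the curl example above; the shape of the JSON response is an assumption, so the sketch only parses whatever JSON comes back:

```python
# Minimal sketch of calling the pt-edge quality API from Python.
# Only the base URL is taken from the listing; response fields are
# not documented here, so we just return the parsed JSON as-is.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    """Build the API URL, e.g. quality_url('transformers', 'BICLab/MetaLA')."""
    return f"{BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """GET the quality record as parsed JSON (no key: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(quality_url("transformers", "BICLab/MetaLA"))
```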
Higher-rated alternatives
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
thu-ml/SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x...
thu-ml/SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
fla-org/flame
🔥 A minimal training framework for scaling FLA models
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for...