BICLab/MetaLA
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024 Oral)
This project offers an improved method for building large language models (LLMs) and other AI models that process sequential data. It takes raw text or other data sequences and produces a more efficient and effective AI model ready for tasks like language generation or classification. It is designed for AI researchers and machine learning engineers developing next-generation foundation models.
No commits in the last 6 months.
Use this if you are a researcher or engineer working on large AI models and need a more efficient and performant 'attention' mechanism than current linear models, especially for long sequences.
Not ideal if you are looking for an off-the-shelf AI model or a simple tool for immediate application, as this is a foundational component for building such models.
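For context on why linear attention helps with long sequences: MetaLA's exact formulation is in the paper, but the generic idea behind linear attention is to replace softmax(QKᵀ)V, which costs O(n²) in sequence length n, with φ(Q)(φ(K)ᵀV), which costs O(n·d²). A minimal NumPy sketch of that generic trick (not MetaLA's method; the feature map φ here is an illustrative placeholder):

```python
# Background sketch only -- generic linear attention, NOT MetaLA's exact
# formulation (see the paper for that). Shows why cost drops from
# O(n^2) to O(n d^2): matrix associativity lets phi(K)^T V be computed
# once, independent of sequence length n.
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: O(n^2) time and memory in sequence length."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    """Linear attention: compute phi(K)^T V first (a d x d_v matrix),
    so total cost is linear in sequence length n."""
    Qp, Kp = phi(Q), phi(K)
    kv = Kp.T @ V                 # (d, d_v), independent of n
    z = Qp @ Kp.sum(axis=0)       # per-query normalizer
    return (Qp @ kv) / z[:, None]

# Tiny demo: both variants map (n, d) inputs to (n, d_v) outputs.
n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
assert out.shape == (n, d)
```

The payoff is the `kv` term: it is computed once per sequence and can be updated recurrently token by token, which is what makes linear-attention models attractive for long-context and streaming generation.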
Stars
35
Forks
2
Language
Python
License
—
Category
Last pushed
Jan 18, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/BICLab/MetaLA"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
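The same request can be made from Python with the standard library. The URL comes from the curl example above; the shape of the JSON response is an assumption, so the sketch only parses whatever JSON comes back:

```python
# Minimal sketch of calling the pt-edge quality API from Python.
# Only the base URL is taken from the listing; response fields are
# not documented here, so we just return the parsed JSON as-is.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    """Build the API URL, e.g. quality_url('transformers', 'BICLab/MetaLA')."""
    return f"{BASE}/{ecosystem}/{repo}"

def fetch_quality(ecosystem: str, repo: str) -> dict:
    """GET the quality record as parsed JSON (no key: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(ecosystem, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(quality_url("transformers", "BICLab/MetaLA"))
```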
Higher-rated alternatives
fla-org/flash-linear-attention
🚀 Efficient implementations of state-of-the-art linear attention models
thu-ml/SageAttention
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x...
thu-ml/SpargeAttn
[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.
fla-org/flame
🔥 A minimal training framework for scaling FLA models
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for...