tommyip/mamba2-minimal
Minimal Mamba-2 implementation in PyTorch
This project provides a minimal PyTorch implementation of Mamba-2 for building language models for tasks like text generation or sequence processing, without the quadratic attention cost of traditional Transformer models. It takes in sequential data (such as text or time series) and produces output logits, supporting efficient training and constant-time per-token inference, which is especially useful for very long sequences. It is aimed at machine learning practitioners and researchers working with sequential data.
243 stars. No commits in the last 6 months.
Use this if you need to develop or experiment with cutting-edge foundation models for sequential data that are faster and more memory-efficient than Transformer architectures, particularly for long sequences.
Not ideal if you are looking for a pre-trained, ready-to-use application rather than a foundational building block for model development.
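To make the constant-time inference claim concrete, here is a minimal, self-contained PyTorch sketch of a diagonal linear state space recurrence, the mechanism behind per-token O(1) inference. This is an illustration only, not the repo's actual Mamba2 module; the names, shapes, and the simplified update rule are assumptions made for the example.

import torch

# Toy diagonal state space model. Per-token inference keeps a fixed-size
# hidden state, so each step costs O(1) no matter how long the sequence is.
# (Illustrative only: the real Mamba-2 block adds input-dependent parameters,
# multi-head structure, and a parallel scan for training.)
d_state, d_model = 16, 8
A = -torch.rand(d_state) * 0.5           # negative log-decay per state channel
B = torch.randn(d_state, d_model) * 0.1  # input projection
C = torch.randn(d_model, d_state) * 0.1  # output projection

def step(h, x):
    # h_t = exp(A) * h_{t-1} + B x_t ;  y_t = C h_t
    h = torch.exp(A) * h + B @ x
    return h, C @ h

h = torch.zeros(d_state)
for _ in range(1000):                    # state size never grows with length
    x = torch.randn(d_model)
    h, y = step(h, x)
print(y.shape)  # torch.Size([8])

During training the same recurrence can be evaluated in parallel across the whole sequence with a structured scan, which is what makes architectures like Mamba-2 competitive with attention on throughput.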
Stars: 243
Forks: 16
Language: Python
License: Apache-2.0
Category: Transformers
Last pushed: Jun 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/tommyip/mamba2-minimal"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
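The same data can be fetched from Python. A minimal sketch, assuming only the endpoint shown above; the response schema is not documented here, so the code simply prints whatever JSON it receives.

import requests

# Fetch the quality record for this repo from the public API.
# No key is needed for up to 100 requests/day, per the note above.
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/tommyip/mamba2-minimal"
resp = requests.get(url, timeout=10)
resp.raise_for_status()   # surface HTTP errors (e.g. rate limiting) early
print(resp.json())        # schema not documented here; inspect the output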
Higher-rated alternatives
ZHZisZZ/dllm
dLLM: Simple Diffusion Language Modeling
pengzhangzhi/Open-dLLM
Open diffusion language model for code generation — releasing pretraining, evaluation,...
EnnengYang/Awesome-Model-Merging-Methods-Theories-Applications
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities. ACM...
THUDM/LongWriter
[ICLR 2025] LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
AIoT-MLSys-Lab/SVD-LLM
[ICLR 2025🔥] SVD-LLM & [NAACL 2025🔥] SVD-LLM V2