OpenMachine-ai/transformer-tricks
A collection of tricks and tools to speed up transformer models
This project collects tricks and tools to streamline and accelerate large language models, particularly those built on the transformer architecture. It applies optimizations to existing transformer model implementations, yielding faster execution and lower memory usage. It is aimed at machine learning engineers and researchers who develop, deploy, or fine-tune transformer-based AI models.
197 stars. Available on PyPI.
Use this if you are working with transformer models and need to improve their speed, reduce their memory footprint, or make them more computationally efficient.
Not ideal if you are looking for a pre-trained model or a high-level API for natural language processing without needing to delve into architectural optimizations.
Stars: 197
Forks: 12
Language: TeX
License: MIT
Category:
Last pushed: Feb 23, 2026
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/OpenMachine-ai/transformer-tricks"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
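The same endpoint can be queried programmatically. Below is a minimal Python sketch using only the standard library; only the URL from the curl command above is taken from this page, while the path pattern (quality/&lt;ecosystem&gt;/&lt;owner&gt;/&lt;repo&gt;) and the assumption that the API returns JSON are guesses:

```python
import json
import urllib.request

# Base endpoint taken from the curl command above; the path pattern
# (quality/<ecosystem>/<owner>/<repo>) is an assumption based on that one URL.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-API URL for a given repository."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"

def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the response (assumes the API returns JSON)."""
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Within the free tier (100 requests/day) no API key is needed.
    print(quality_url("transformers", "OpenMachine-ai", "transformer-tricks"))
```

If you have a free key for the 1,000/day tier, the API presumably expects it in a header or query parameter; check the provider's docs, since the exact mechanism is not stated here.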
Related models
huggingface/text-generation-inference
Large Language Model Text Generation Inference
poloclub/transformer-explainer
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
IBM/TabFormer
Code & Data for "Tabular Transformers for Modeling Multivariate Time Series" (ICASSP, 2021)
tensorgi/TPA
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6)...
lorenzorovida/FHE-BERT-Tiny
Source code for the paper "Transformer-based Language Models and Homomorphic Encryption: an...