kyegomez/SimplifiedTransformers

SimplifiedTransformer simplifies the transformer block without affecting training: skip connections, projection parameters, sequential sub-blocks, and normalization layers are removed. Experimental results confirm similar training speed and performance.

/ 100

Emerging

This project offers a simplified approach to building and training AI models, specifically those based on transformer architectures. It helps machine learning engineers and researchers by taking standard transformer model configurations and reducing their complexity. The output is a more streamlined and efficient transformer model that maintains training speed and performance while consuming fewer computational resources.
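To give a rough sense of what removing projection parameters saves, the sketch below counts weight matrices in one block before and after simplification. It rests on assumptions that this listing does not state: that the removed "projection parameters" are the value and output projections of self-attention, that the MLP uses a 4x hidden expansion, and that biases and normalization parameters are small enough to ignore. The function names are hypothetical and are not taken from the repository.

```python
def attention_params(d_model: int, keep_value_and_output: bool) -> int:
    """Weight count for one self-attention sub-block (biases ignored)."""
    query_key = 2 * d_model * d_model  # query and key projections
    # Assumption: the simplified block drops the value and output projections.
    value_output = 2 * d_model * d_model if keep_value_and_output else 0
    return query_key + value_output


def mlp_params(d_model: int, expansion: int = 4) -> int:
    """Weight count for a two-layer MLP with the assumed hidden expansion."""
    return 2 * d_model * (expansion * d_model)


def block_params(d_model: int, simplified: bool, expansion: int = 4) -> int:
    """Total weight count for one block under the assumptions above."""
    attention = attention_params(d_model, keep_value_and_output=not simplified)
    return attention + mlp_params(d_model, expansion)


if __name__ == "__main__":
    d_model = 768  # illustrative width, not a value from the project
    standard = block_params(d_model, simplified=False)
    simplified = block_params(d_model, simplified=True)
    saving = 1 - simplified / standard
    print(f"standard block:   {standard:,} weights")
    print(f"simplified block: {simplified:,} weights")
    print(f"relative saving:  {saving:.1%}")
```

Under these assumptions the standard block holds 12 · d_model² weights and the simplified one 10 · d_model², a saving of one sixth per block regardless of width. Removing skip connections and normalization layers changes the computation graph rather than the weight count, so this estimate does not capture them.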

Use this if you are a machine learning engineer or researcher looking to experiment with more efficient and stable transformer architectures for your AI models.

Not ideal if you need to strictly adhere to traditional transformer block designs or are looking for a pre-trained model rather than a simplified architecture.

AI-model-development machine-learning-engineering deep-learning-research neural-network-architecture model-optimization

No Package No Dependents

Maintenance 10 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 15 / 25

Language

Python

License

MIT

Higher-rated alternatives

huggingface/transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...

kyegomez/LongNet

Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

pbloem/former

Simple transformer implementation from scratch in pytorch. (archival, latest version on codeberg)

NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

ARM-software/keyword-transformer

Official implementation of the Keyword Transformer: https://arxiv.org/abs/2104.00769
