januverma/transformers-stuff

Code, scripts, and notebooks on various aspects of transformer models.

/ 100

Experimental

This project offers educational code and explanations for those learning how transformer models work. It details the inner workings of models like GPT, showing how they process input data and generate text. Aspiring machine learning engineers and researchers can use this to deepen their understanding of foundational AI architectures.

No commits in the last 6 months.

Use this if you are an AI/ML practitioner looking to understand the technical details and implementation of transformer neural networks from the ground up.

Not ideal if you are looking for a ready-to-use tool to solve a specific business problem or deploy a pre-trained model.

Machine-Learning-Education AI-Research Deep-Learning-Fundamentals Natural-Language-Processing Model-Architecture

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 12 / 25

Language

Jupyter Notebook

License

—

Higher-rated alternatives

huggingface/transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...

kyegomez/LongNet

Implementation of plug in and play Attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"

pbloem/former

Simple transformer implementation from scratch in pytorch. (archival, latest version on codeberg)

NVIDIA/FasterTransformer

Transformer related optimization, including BERT, GPT

kyegomez/SimplifiedTransformers

SimplifiedTransformer simplifies transformer block without affecting training. Skip connections,...