januverma/transformers-stuff
Code, scripts, and notebooks on various aspects of transformer models.
This project offers educational code and explanations for those learning how transformer models work. It details the inner workings of models like GPT, showing how they process input data and generate text. Aspiring machine learning engineers and researchers can use this to deepen their understanding of foundational AI architectures.
No commits in the last 6 months.
Use this if you are an AI/ML practitioner looking to understand the technical details and implementation of transformer neural networks from the ground up.
Not ideal if you are looking for a ready-to-use tool to solve a specific business problem or deploy a pre-trained model.
Stars: 27
Forks: 4
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 27, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/januverma/transformers-stuff"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
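For scripted use, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not the service's official client: the helper names `quality_url` and `fetch_quality` are hypothetical, the idea that other repositories follow the same `/owner/repo` path pattern is an assumption drawn from the single URL above, and the response format is not stated here, so the body is returned as undecoded-structure text (UTF-8 assumed).

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is BASE_URL/<owner>/<repo>, as in the curl example.
    """
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (UTF-8 is an assumption)."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request, counts against the daily limit):
# print(fetch_quality("januverma", "transformers-stuff"))
```

Keeping URL construction separate from the network call makes it possible to check the request target without spending any of the 100 requests/day allowance.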
Higher-rated alternatives
huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in...
kyegomez/LongNet
Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
pbloem/former
Simple transformer implementation from scratch in pytorch. (archival, latest version on codeberg)
NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
kyegomez/SimplifiedTransformers
SimplifiedTransformer simplifies transformer block without affecting training. Skip connections,...