AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow
Implementation of the paper "Improving Language Understanding by Generative Pre-Training"
This project helps machine learning engineers and researchers understand how foundational language models work. It provides a complete, working example of the original GPT model built from scratch. By examining and modifying this code, you can learn the core components of text generation: how a model takes raw text as input and produces new, contextually relevant text.
Use this if you are a machine learning engineer or researcher who wants to deeply understand the architecture and inner workings of generative pre-trained transformer models for educational purposes or to build custom components.
Not ideal if you need a ready-to-use, high-performance language model for large-scale production applications or if you just want to apply an existing model without diving into its implementation details.
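The core mechanism the repository implements, causal (masked) self-attention, can be sketched in a few lines. This is a minimal single-head NumPy illustration, not code from the repository (which uses TensorFlow); all shapes and names here are illustrative assumptions.

```python
# Minimal sketch of the causal self-attention at the heart of a GPT
# decoder block. Single head, NumPy for clarity; the repository itself
# is built with TensorFlow.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Masked self-attention over a sequence x of shape (T, d)."""
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d)               # (T, T) similarity scores
    mask = np.triu(np.ones((T, T)), k=1)        # 1s above the diagonal mark future positions
    scores = np.where(mask == 1, -1e9, scores)  # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                          # (T, d) context vectors

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.standard_normal((T, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Because of the mask, the first output position depends only on the first input token, which is what lets a GPT-style model generate text autoregressively, one token at a time.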
Stars
19
Forks
5
Language
Python
License
MIT
Category
transformers
Last pushed
Mar 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
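The same endpoint can be called from Python. A minimal sketch, assuming only the URL shape shown in the curl example above; the `Authorization` header name and the structure of the JSON response are assumptions, not documented here.

```python
# Hypothetical client sketch for the quality API shown above.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data; an API key (header name is an assumption) raises the daily limit."""
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header name
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

url = quality_url("transformers", "AliHaiderAhmad001",
                  "GPT-from-Scratch-with-Tensorflow")
print(url)
```

Keyless access is rate-limited to 100 requests/day, so a client that polls this data regularly should cache responses or use a free key.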
Related models
HomebrewML/HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
akshat0123/GPT-1
PyTorch implementation of GPT-1
qiqiApink/MotionGPT
The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose...
nawnoes/pytorch-gpt-x
An implementation of an autoregressive language model using an improved Transformer and...
Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models