kamalkraj/minGPT-TF
A minimal TensorFlow 2 re-implementation of OpenAI GPT training
This project helps machine learning practitioners and researchers understand and implement the core components of GPT-like models in TensorFlow. The model takes a sequence of integer token IDs (representing text or other discrete data) and outputs a probability distribution over the next token in the sequence. Data scientists, AI researchers, and students learning about generative models can use it to experiment with foundational transformer architectures.
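To make the input/output contract described above concrete, here is a minimal sketch in plain Python. It is not code from the repository: `toy_logits` is a hypothetical stand-in for a trained model, and `next_token_distribution` and the vocabulary size of 8 are assumptions chosen for illustration. Only the final softmax step, which turns per-token scores into a probability distribution, reflects how GPT-style models generally produce their output.

```python
import math
import random


def toy_logits(tokens, vocab_size, seed=0):
    """Hypothetical stand-in for a trained model.

    Returns one unnormalized score (logit) per vocabulary entry.
    A real GPT would compute these with transformer layers.
    """
    rng = random.Random(seed + sum(tokens))
    return [rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]


def softmax(logits):
    """Convert logits into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(tokens, vocab_size):
    """Map a sequence of token IDs to a distribution over the next token."""
    return softmax(toy_logits(tokens, vocab_size))


# Token IDs in, one probability per vocabulary entry out.
probs = next_token_distribution([3, 1, 4], vocab_size=8)
```

Sampling from `probs` (or taking its largest entry) and appending the chosen ID to the input sequence is the usual way such a model generates text one token at a time.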
No commits in the last 6 months.
Use this if you want a clear, minimal, and educational implementation of a GPT model's training process in TensorFlow.
Not ideal if you need to train very large, GPT-3-scale models that require distributed training and memory management beyond the limits of a typical single GPU.
Stars: 58
Forks: 18
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Sep 01, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/kamalkraj/minGPT-TF"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
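The same request can be sketched in Python using only the standard library. The URL is the one from the curl command above. Everything else is an assumption made for illustration: the helper names `quality_url` and `fetch_quality`, the idea that other repositories follow the same owner/name path pattern, and the UTF-8 decoding of the response. The response format is not documented here, so the sketch returns the raw body as text rather than parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; treating it as a reusable
# prefix for other repositories is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner, repo):
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text (hypothetical helper).

    Assumes the body is UTF-8 encoded. No API key is sent, so the
    anonymous rate limit applies.
    """
    with urlopen(quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = quality_url("kamalkraj", "minGPT-TF")
# To perform the request: print(fetch_quality("kamalkraj", "minGPT-TF"))
```

Keeping URL construction separate from the network call makes the anonymous limit easier to respect, since the URL can be checked or cached without issuing a request.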
Higher-rated alternatives
LowinLi/transformers-stream-generator
This is a text generation method which returns a generator, streaming out each token in...
ystemsrx/mini-nanoGPT
One-click training of your own GPT. Training a GPT has never been easier for beginners. /...
jaymody/picoGPT
An unnecessarily tiny implementation of GPT-2 in NumPy.
kyegomez/AttentionGrid
A network of attention mechanisms at your fingertips. Unleash the potential of attention...
abhaskumarsinha/MinimalGPT
MinimalGPT is a concise, adaptable, and streamlined code framework that encompasses the...