AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow
Implementation of the paper "Improving Language Understanding by Generative Pre-Training"
This project helps machine learning engineers and researchers understand how foundational language models work. It provides a complete, working example of the original GPT model built from scratch. By examining and modifying this code, you can learn the core components of text generation: how a model takes raw text as input and produces new, contextually relevant text.
Use this if you are a machine learning engineer or researcher who wants to deeply understand the architecture and inner workings of generative pre-trained transformer models for educational purposes or to build custom components.
Not ideal if you need a ready-to-use, high-performance language model for large-scale production applications or if you just want to apply an existing model without diving into its implementation details.
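The core mechanism the repository implements, causal (masked) self-attention, can be sketched in a few lines. This is a minimal single-head NumPy illustration, not code from the repository (which uses TensorFlow); all shapes and names here are illustrative assumptions.

```python
# Minimal sketch of the causal self-attention at the heart of a GPT
# decoder block. Single head, NumPy for clarity; the repository itself
# is built with TensorFlow.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Masked self-attention over a sequence x of shape (T, d)."""
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d)               # (T, T) similarity scores
    mask = np.triu(np.ones((T, T)), k=1)        # 1s above the diagonal mark future positions
    scores = np.where(mask == 1, -1e9, scores)  # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                          # (T, d) context vectors

rng = np.random.default_rng(0)
T, d = 4, 8
x = rng.standard_normal((T, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Because of the mask, the first output position depends only on the first input token, which is what lets a GPT-style model generate text autoregressively, one token at a time.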
Stars
19
Forks
5
Language
Python
License
MIT
Category
transformers
Last pushed
Mar 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/AliHaiderAhmad001/GPT-from-Scratch-with-Tensorflow"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
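The same endpoint can be called from Python. A minimal sketch, assuming only the URL shape shown in the curl example above; the `Authorization` header name and the structure of the JSON response are assumptions, not documented here.

```python
# Hypothetical client sketch for the quality API shown above.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str, api_key=None) -> dict:
    """Fetch quality data; an API key (header name is an assumption) raises the daily limit."""
    req = urllib.request.Request(quality_url(category, owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header name
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

url = quality_url("transformers", "AliHaiderAhmad001",
                  "GPT-from-Scratch-with-Tensorflow")
print(url)
```

Keyless access is rate-limited to 100 requests/day, so a client that polls this data regularly should cache responses or use a free key.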
Related models
HomebrewML/HomebrewNLP-torch
A case study of efficient training of large language models using commodity hardware.
akshat0123/GPT-1
PyTorch implementation of GPT-1
qiqiApink/MotionGPT
The official PyTorch implementation of the paper "MotionGPT: Finetuned LLMs are General-Purpose...
nawnoes/pytorch-gpt-x
An implementation of an autoregressive language model using an improved Transformer and...
Shenggan/atp
Adaptive Tensor Parallelism for Foundation Models