jaydeepthik/Nano-GPT

Simple GPT with multi-head attention over character-level tokens, inspired by Andrej Karpathy's video lectures: https://github.com/karpathy/ng-video-lecture
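The core component named in the tagline is causal multi-head self-attention. As a rough PyTorch sketch of that idea (not code taken from this repository; the class name, hyperparameters n_embd, n_head, block_size, and layer layout are assumptions for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    """Minimal causal multi-head self-attention, sketched for illustration."""
    def __init__(self, n_embd=64, n_head=4, block_size=128):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd, bias=False)  # queries, keys, values in one projection
        self.proj = nn.Linear(n_embd, n_embd)                  # output projection
        # causal mask: each position may only attend to itself and earlier positions
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (batch, heads, time, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / (k.size(-1) ** 0.5)  # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        out = att @ v                                           # weighted sum of values
        out = out.transpose(1, 2).contiguous().view(B, T, C)    # merge heads back together
        return self.proj(out)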

Overall score: 12 / 100 (Experimental)

This project is aimed at machine learning practitioners and researchers who want to understand the fundamental building blocks of generative pre-trained transformers (GPTs). You feed it text data and watch a simplified GPT model learn to predict the next character, producing new text from the patterns it has picked up. It is well suited to anyone who wants to grasp the inner workings of models like ChatGPT at a foundational level.
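As a loose illustration of the character-level setup described above (again, not code from this repository; the tiny corpus and variable names are invented for the example), the sketch below builds a character vocabulary and shows the encode/decode step that next-character prediction is built on:

# Hypothetical miniature corpus standing in for the text the notebook would load.
text = "hello world\nhello gpt\n"

# Character-level vocabulary: every distinct character becomes one token.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s):
    # string -> list of integer token ids
    return [stoi[c] for c in s]

def decode(ids):
    # list of integer token ids -> string
    return "".join(itos[i] for i in ids)

print(encode("hello"))          # [5, 3, 6, 6, 7] for this toy corpus
print(decode(encode("hello")))  # "hello"

# Training then amounts to predicting token t+1 from tokens <= t; generation
# repeatedly samples the next character id and appends it to the running context.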

No commits in the last 6 months.

Use this if you are a machine learning student or researcher seeking to learn and experiment with the core architecture of a GPT model through a simplified implementation.

Not ideal if you are looking for a production-ready, high-performance language model or a tool to perform advanced natural language processing tasks out-of-the-box.

AI-education NLP-research transformer-architecture generative-AI machine-learning-fundamentals
No License, Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 8 / 25
Community: 0 / 25


Stars: 7
Forks:
Language: Jupyter Notebook
License: None
Last pushed: May 10, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jaydeepthik/Nano-GPT"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
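The same request can be made from Python. This is a sketch only: it reuses the endpoint from the curl example above, assumes the third-party requests package is available, and makes no claims about the response schema, which is not documented on this page.

import requests

# Same endpoint as the curl example; the free tier needs no API key.
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/jaydeepthik/Nano-GPT"
response = requests.get(url, timeout=30)
response.raise_for_status()          # raise on HTTP errors
print(response.json())               # exact response fields are not documented here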