deepanwadhwa/nanogpt-Audio
An experimental nanoGPT fork that learns to speak Shakespeare by modeling EnCodec audio tokens.
This project helps audio engineers and researchers experiment with training transformer models on raw audio. You provide text, which is converted to spoken audio; that audio is then encoded into discrete EnCodec tokens. The model learns to generate new token sequences, effectively producing speech in a learned voice.
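EnCodec represents audio as several parallel codebook streams, while a GPT-style model consumes a single 1-D token sequence, so the codebooks are typically flattened into one stream before training. A minimal sketch of one such flattening, using per-codebook id offsets; the codebook count, codebook size, and interleaving order here are illustrative assumptions, not this repository's exact scheme:

```python
import numpy as np

# EnCodec at 24 kHz / 6 kbps yields 8 parallel codebooks of 1024 entries each.
# These constants are illustrative assumptions, not read from this repository.
N_CODEBOOKS = 8
CODEBOOK_SIZE = 1024

def flatten_codes(codes: np.ndarray) -> np.ndarray:
    """Interleave an [n_codebooks, T] EnCodec code matrix into a 1-D token stream.

    Codebook k is offset by k * CODEBOOK_SIZE so each codebook occupies a
    disjoint id range and the model can tell the streams apart.
    """
    n_q, _ = codes.shape
    offsets = (np.arange(n_q) * CODEBOOK_SIZE)[:, None]  # [n_q, 1]
    shifted = codes + offsets                            # disjoint id ranges
    return shifted.T.reshape(-1)                         # frame-major interleave

def unflatten_codes(tokens: np.ndarray, n_q: int = N_CODEBOOKS) -> np.ndarray:
    """Invert flatten_codes: recover the [n_codebooks, T] code matrix."""
    frames = tokens.reshape(-1, n_q).T
    return frames - (np.arange(n_q) * CODEBOOK_SIZE)[:, None]

# Round-trip check on a tiny fake code matrix (2 codebooks, 3 frames).
fake = np.array([[1, 2, 3], [4, 5, 6]])
flat = flatten_codes(fake)
assert np.array_equal(unflatten_codes(flat, n_q=2), fake)
```

The generated stream is decoded by reversing the offsets, reshaping back to `[n_codebooks, T]`, and feeding the matrix to EnCodec's decoder to recover a waveform.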
Use this if you are an audio researcher or sound designer interested in exploring generative audio models specifically trained on spoken language.
Not ideal if you need a production-ready text-to-speech system or a tool to generate audio from diverse input styles beyond a single learned voice.
Stars: 13
Forks: —
Language: Python
License: MIT
Category:
Last pushed: Dec 31, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/deepanwadhwa/nanogpt-Audio"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
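The same endpoint can be queried from Python using only the standard library. A minimal sketch; the URL is taken from the curl command above, but the response's JSON field names are not documented here, so inspect the payload rather than assuming keys:

```python
import json
from urllib.request import urlopen

# Endpoint copied from the curl example above.
API_URL = ("https://pt-edge.onrender.com/api/v1/quality/"
           "llm-tools/deepanwadhwa/nanogpt-Audio")

def fetch_quality_data(url: str = API_URL) -> dict:
    """Fetch the repo's quality data as parsed JSON (100 requests/day keyless)."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    data = fetch_quality_data()
    print(json.dumps(data, indent=2))  # inspect the available fields
```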
Higher-rated alternatives
vixhal-baraiya/microgpt-c
The most atomic way to train and inference a GPT in pure, dependency-free C
milanm/AutoGrad-Engine
A complete GPT language model (training and inference) in ~600 lines of pure C#, zero dependencies
LeeSinLiang/microGPT
Implementation of GPT from scratch. Designed to be lightweight and easy to modify.
dubzdubz/microgpt-ts
A complete GPT built from scratch in TypeScript with zero dependencies
biegehydra/NanoGptDotnet
A miniature large language model (LLM) that generates Shakespeare-like text, written in C#....