kyegomez/TinyGPTV
Simple Implementation of TinyGPTV in super simple Zeta lego blocks
This project is aimed at machine learning engineers and researchers who build or experiment with large language model architectures. It provides simple, pre-built implementations of key components such as multi-head attention, LoRA, and various normalization layers, enabling rapid prototyping of new model designs. The blocks take raw input tensors and produce processed tensors, so they can be composed with further model layers.
No commits in the last 6 months.
Use this if you are a machine learning engineer or researcher designing or experimenting with new large language model architectures and need modular building blocks.
Not ideal if you are looking for a pre-trained, ready-to-use large language model for direct application rather than for architectural development.
Stars
16
Forks
—
Language
Python
License
MIT
Category
Last pushed
Nov 11, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kyegomez/TinyGPTV"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron...
shibing624/textgen
TextGen: Implementation of Text Generation models, include LLaMA, BLOOM, GPT2, BART, T5, SongNet...
ai-forever/ru-gpts
Russian GPT3 models.
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...