EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
A specialized toolkit for researchers and engineers who need to train very large language models from scratch, or fine-tune existing ones, on substantial compute. It takes raw text data and configuration settings as input and outputs a custom-trained language model capable of generating human-like text. It is aimed at users working at the cutting edge of AI, typically in academic, industry, or government labs.
Use this if you are a researcher or engineer looking to train large-scale language models with billions of parameters using distributed training across multiple GPUs or high-performance computing clusters.
Not ideal if you are looking to run generic inference with existing models or train smaller models; in those cases, simpler libraries like Hugging Face's `transformers` are more appropriate.
Stars
7,399
Forks
1,100
Language
Python
License
Apache-2.0
Category
transformers
Last pushed
Feb 03, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/EleutherAI/gpt-neox"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
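The curl command above can also be issued from Python. A minimal sketch using only the standard library; the `quality_api_url` and `fetch_quality` helpers are hypothetical names, and the response schema is not documented here, so the JSON is passed through as-is:

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_api_url(owner: str, repo: str, category: str = "transformers") -> str:
    """Build the quality-endpoint URL for a repository.

    The 'transformers' category segment comes from the curl example;
    whether other category values exist is an assumption.
    """
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch repository metadata from the API.

    Counts against the 100 requests/day no-key quota mentioned above.
    """
    with urllib.request.urlopen(quality_api_url(owner, repo), timeout=10) as resp:
        return json.load(resp)


# Example (performs a real network request):
# data = fetch_quality("EleutherAI", "gpt-neox")
```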
Compare
Related models
tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
shibing624/textgen
TextGen: implementations of text generation models, including LLaMA, BLOOM, GPT2, BART, T5, SongNet...
ai-forever/ru-gpts
Russian GPT3 models.
AdityaNG/kan-gpt
The PyTorch implementation of Generative Pre-trained Transformers (GPTs) using Kolmogorov-Arnold...
zemlyansky/gpt-tfjs
GPT in TensorFlow.js