yongzhuo/MacroGPT-Pretrain
Full-parameter pre-training of the MacroGPT large language model (1.3B parameters, 32 layers), with multi-GPU DeepSpeed or single-GPU Adafactor training.
This project helps machine learning engineers and researchers pre-train large language models (LLMs) from scratch or continue training existing ones. It takes large text datasets, such as encyclopedias or domain-specific corpora, and outputs a trained 1.3-billion-parameter GPT-style language model. The primary users are those developing or customizing LLMs for specific applications.
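For the single-GPU path, a rough sketch of what Adafactor-based pre-training looks like with Hugging Face Transformers is below. The 32-layer configuration follows the project description; the hidden size, head count, and dummy batch are illustrative assumptions, not the repo's actual hyperparameters.

```python
# A minimal sketch of the single-GPU Adafactor setup described above.
# Assumptions: the 32-layer/~1.3B shape is reconstructed from the project
# description; hidden size, head count, and the batch are illustrative.
import torch
from transformers import GPT2Config, GPT2LMHeadModel
from transformers.optimization import Adafactor

device = "cuda" if torch.cuda.is_available() else "cpu"

config = GPT2Config(
    n_layer=32,        # 32 transformer layers, per the project description
    n_embd=1792,       # ~1.3B total parameters at this hidden size (assumption)
    n_head=28,         # 64-dim attention heads (assumption)
    vocab_size=50257,
)
model = GPT2LMHeadModel(config).to(device)

# Adafactor's factored second-moment statistics keep optimizer memory far
# below AdamW's, which is the usual reason to pick it for single-GPU runs.
optimizer = Adafactor(
    model.parameters(),
    scale_parameter=True,
    relative_step=True,   # lr is derived from an internal relative-step schedule
    warmup_init=True,
    lr=None,
)

# One training step on a random dummy batch; replace with a real dataloader.
input_ids = torch.randint(0, config.vocab_size, (2, 512), device=device)
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The multi-GPU path would instead launch the training script through the DeepSpeed launcher with a ZeRO config; the repo's actual script and config names are not shown here.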
No commits in the last 6 months.
Use this if you need to pre-train a 1.3 billion-parameter GPT-style language model using large datasets and require support for multi-GPU or single-GPU training configurations.
Not ideal if you are looking for a fine-tuned model for immediate use, or if your primary goal is to run inference with an existing LLM rather than to train one.
Stars: 15
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Nov 30, 2023
Commits (last 30 days): 0
Get this data via API:
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yongzhuo/MacroGPT-Pretrain"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
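To query the same endpoint programmatically, a minimal Python sketch follows. It assumes only that the endpoint returns JSON and pretty-prints whatever comes back; no particular response schema is assumed.

```python
# A minimal sketch of fetching the quality data shown above.
# Assumption: the endpoint returns a JSON body; the schema is undocumented
# here, so the response is simply pretty-printed.
import json
import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/yongzhuo/MacroGPT-Pretrain")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of printing garbage
print(json.dumps(resp.json(), indent=2))
```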
Higher-rated alternatives
Tencent/PatrickStar
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.
OpenMotionLab/MotionGPT
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language...
wenlu-lab/cMolGPT
GPT (Generative Pre-trained Transformer) for de novo molecular design by enforcing specified targets
OpenMotionLab/MotionGPT3
MotionGPT3: Human Motion as a Second Modality, a MoT-based framework for unified motion...
SmerkyG/gptcore
Fast modular code to create and train cutting edge LLMs