yongzhuo/MacroGPT-Pretrain
Full-parameter pre-training of the MacroGPT large language model (1.3B parameters, 32 layers), with multi-GPU DeepSpeed or single-GPU Adafactor training.
This project helps machine learning engineers and researchers pre-train large language models (LLMs) from scratch or continue training existing ones. It takes large text datasets, such as encyclopedias or domain-specific corpora, and outputs a trained 1.3-billion-parameter GPT-style language model. The primary users are those developing or customizing LLMs for specific applications.
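For the single-GPU path, a rough sketch of what Adafactor-based pre-training looks like with Hugging Face Transformers is below. The 32-layer configuration follows the project description; the hidden size, head count, and dummy batch are illustrative assumptions, not the repo's actual hyperparameters.

```python
# A minimal sketch of the single-GPU Adafactor setup described above.
# Assumptions: the 32-layer/~1.3B shape is reconstructed from the project
# description; hidden size, head count, and the batch are illustrative.
import torch
from transformers import GPT2Config, GPT2LMHeadModel
from transformers.optimization import Adafactor

device = "cuda" if torch.cuda.is_available() else "cpu"

config = GPT2Config(
    n_layer=32,        # 32 transformer layers, per the project description
    n_embd=1792,       # ~1.3B total parameters at this hidden size (assumption)
    n_head=28,         # 64-dim attention heads (assumption)
    vocab_size=50257,
)
model = GPT2LMHeadModel(config).to(device)

# Adafactor's factored second-moment statistics keep optimizer memory far
# below AdamW's, which is the usual reason to pick it for single-GPU runs.
optimizer = Adafactor(
    model.parameters(),
    scale_parameter=True,
    relative_step=True,   # lr is derived from an internal relative-step schedule
    warmup_init=True,
    lr=None,
)

# One training step on a random dummy batch; replace with a real dataloader.
input_ids = torch.randint(0, config.vocab_size, (2, 512), device=device)
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The multi-GPU path would instead launch the training script through the DeepSpeed launcher with a ZeRO config; the repo's actual script and config names are not shown here.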
No commits in the last 6 months.
Use this if you need to pre-train a 1.3 billion-parameter GPT-style language model using large datasets and require support for multi-GPU or single-GPU training configurations.
Not ideal if you are looking for a fine-tuned model for immediate use, or if your primary goal is to run inference with an existing LLM rather than to train one.
Stars: 15
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Nov 30, 2023
Commits (last 30 days): 0
Get this data via API:
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yongzhuo/MacroGPT-Pretrain"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
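To query the same endpoint programmatically, a minimal Python sketch follows. It assumes only that the endpoint returns JSON and pretty-prints whatever comes back; no particular response schema is assumed.

```python
# A minimal sketch of fetching the quality data shown above.
# Assumption: the endpoint returns a JSON body; the schema is undocumented
# here, so the response is simply pretty-printed.
import json
import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "llm-tools/yongzhuo/MacroGPT-Pretrain")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # fail loudly on 4xx/5xx instead of printing garbage
print(json.dumps(resp.json(), indent=2))
```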
Higher-rated alternatives
Tencent/PatrickStar
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.
OpenMotionLab/MotionGPT
[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language...
wenlu-lab/cMolGPT
GPT (Generative Pre-trained Transformer) for de novo molecular design by enforcing specified targets
OpenMotionLab/MotionGPT3
MotionGPT3: Human Motion as a Second Modality, a MoT-based framework for unified motion...
SmerkyG/gptcore
Fast modular code to create and train cutting edge LLMs