yongzhuo/MacroGPT-Pretrain

Full-parameter pre-training of the MacroGPT large language model (1.3B parameters, 32 layers); multi-GPU training via DeepSpeed, single-GPU training via Adafactor.
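For orientation, here is a minimal sketch of what single-GPU pre-training with Adafactor might look like at this scale, using Hugging Face transformers. This is not the repo's actual code: the model dimensions are illustrative guesses chosen to land near 1.3B parameters at 32 layers, and the real hyperparameters may differ.

import torch
from transformers import GPT2Config, GPT2LMHeadModel
from transformers.optimization import Adafactor

config = GPT2Config(
    vocab_size=50257,   # assumption: GPT-2 tokenizer vocabulary
    n_layer=32,         # 32 transformer layers, per the project description
    n_embd=1792,        # illustrative hidden size (~1.3B total parameters)
    n_head=16,
    n_positions=1024,
)
model = GPT2LMHeadModel(config).cuda()

# Adafactor stores far less optimizer state than Adam, which is why it is
# a common choice for full pre-training on a single GPU.
optimizer = Adafactor(
    model.parameters(),
    lr=1e-4,
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)

# One illustrative training step on a dummy causal-LM batch.
input_ids = torch.randint(0, config.vocab_size, (1, 512), device="cuda")
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()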

Quality score: 36 / 100 (Emerging)

This project helps machine learning engineers and researchers pre-train large language models (LLMs) from scratch or continue training existing ones. It takes large text datasets, such as encyclopedias or domain-specific corpora, and outputs a trained 1.3 billion-parameter GPT-style language model. The primary users are those developing or customizing LLMs for specific applications.

No commits in the last 6 months.

Use this if you need to pre-train a 1.3 billion-parameter GPT-style language model using large datasets and require support for multi-GPU or single-GPU training configurations.

Not ideal if you are looking for a fine-tuned model for immediate use or if your primary goal is to use an existing LLM for inference rather than training one.
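For the multi-GPU path mentioned above, a minimal sketch of wrapping the same kind of model in DeepSpeed follows; this is an assumption rather than the repo's actual configuration. ZeRO stage 2 is a common choice at the ~1.3B scale.

import deepspeed
from transformers import GPT2Config, GPT2LMHeadModel

model = GPT2LMHeadModel(GPT2Config(n_layer=32))  # illustrative 32-layer config

# Hypothetical DeepSpeed config; the repo's real settings may differ.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler).
# Launch across GPUs with e.g.: deepspeed --num_gpus=8 train.py
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)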

large-language-models natural-language-processing machine-learning-engineering model-pretraining deep-learning-research
Stale 6m · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 6 / 25
Maturity: 16 / 25
Community: 14 / 25


Stars: 15
Forks: 3
Language: Python
License: Apache-2.0
Last pushed: Nov 30, 2023
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yongzhuo/MacroGPT-Pretrain"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
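The same data can be fetched programmatically; here is a minimal Python sketch. The JSON field names are not documented here, so the payload is printed rather than assumed.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/yongzhuo/MacroGPT-Pretrain"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # inspect the payload; the schema is not shown above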