zhanshijinwat/Steel-LLM

Train a 1B-parameter LLM on 1T tokens from scratch as an individual

Score: 38 / 100 (Emerging)

Steel-LLM is a project for people who want to build their own Chinese large language model (LLM) from scratch. It provides a complete guide and all the code needed to collect and process Chinese text data and then pre-train an LLM. The result is a working Chinese LLM trained on your own data, ready for fine-tuning.

791 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer with access to 8 or more GPUs (such as H800 or A100) and want to pre-train a Chinese LLM from the ground up rather than simply use an existing model.

Not ideal if you want a ready-to-use LLM for immediate application, or if you lack significant GPU resources and experience with LLM training.

large-language-model-training natural-language-processing machine-learning-engineering AI-research computational-linguistics
Flags: No License · Stale (6 months) · No Package · No Dependents

Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 18 / 25

How are scores calculated?

Stars: 791
Forks: 78
Language: Jupyter Notebook
License: None
Last pushed: Apr 27, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/zhanshijinwat/Steel-LLM"

Open to everyone: 100 requests/day with no key needed. Get a free API key for 1,000 requests/day.
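For scripting, the same endpoint can be called from Python instead of curl. This is a minimal sketch: the endpoint URL is taken from the example above, but the JSON response schema is not documented here, so the code simply pretty-prints whatever comes back rather than assuming field names.

```python
# Minimal sketch of fetching the quality data from Python.
# Endpoint comes from the curl example above; the response schema is
# not documented on this page, so we just print the JSON as received.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


if __name__ == "__main__":
    url = quality_url("zhanshijinwat", "Steel-LLM")
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)  # inspect the keys; schema may change
    print(json.dumps(data, indent=2, ensure_ascii=False))
```

Within the free tier's 100 requests/day, this is enough to poll a handful of repositories; for larger batches, an API key raises the limit.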