datawhalechina/llm-deploy

Large Models / LLM Inference and Deployment: Theory and Practice

Score: 39 / 100 (Emerging)

This project provides practical guidance and theoretical foundations for deploying large language models (LLMs) into production: turning a trained LLM into a live, optimized serving system that handles user requests efficiently. It is aimed at algorithm engineers and anyone interested in the technical side of LLM deployment.

381 stars. No commits in the last 6 months.

Use this if you are an algorithm engineer or student needing to understand the end-to-end process of taking a large language model from development to a live, performant service.

Not ideal if you are looking for an introduction to training LLMs or their applications, as this focuses specifically on the deployment and inference stages.

Keywords: LLM deployment, model serving, inference optimization, AI engineering, machine learning operations
Flags: No License · Stale (6 months) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 19 / 25


Stars: 381
Forks: 51
Language: (none listed)
License: (none listed)
Last pushed: Jul 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/datawhalechina/llm-deploy"

Open to everyone: 100 requests/day with no key needed. Get a free API key for 1,000 requests/day.
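The curl call above can also be made from Python. A minimal sketch using only the standard library, assuming just the endpoint shape shown in the curl example; the structure of the JSON response is not documented here, so the sketch returns the decoded payload as-is:

```python
"""Fetch a repository quality score from the pt-edge API (sketch)."""
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(ecosystem: str, repo: str) -> str:
    """Build the per-repository endpoint URL, e.g.
    .../quality/transformers/datawhalechina/llm-deploy"""
    return f"{BASE_URL}/{ecosystem}/{repo}"


def fetch_quality(ecosystem: str, repo: str) -> dict:
    """GET the endpoint and decode the JSON body.

    No API key is required for up to 100 requests/day; the response
    field names are not specified on this page, so callers should
    inspect the returned dict.
    """
    with urllib.request.urlopen(build_url(ecosystem, repo)) as resp:
        return json.load(resp)


# Demonstrate URL construction without making a network request.
print(build_url("transformers", "datawhalechina/llm-deploy"))
```

Calling `fetch_quality("transformers", "datawhalechina/llm-deploy")` performs the same request as the curl command; wrap it in a try/except for `urllib.error.HTTPError` in real use, since the daily rate limit will surface as an HTTP error.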