zjhellofss/KuiperLLama
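A strong project for campus recruiting (autumn and spring hiring rounds) and internships: build, hands-on and from scratch, an LLM inference framework that supports Llama2/3 and Qwen2.5.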
This project is a hands-on course on building a large language model (LLM) inference framework from scratch. It uses popular LLMs such as Llama2/3 and Qwen2.5 as its target models and demonstrates how to run them for efficient text generation. It is aimed at aspiring AI engineers and computer science students who want to master LLM deployment skills.
509 stars.
Use this if you are a computer science student or an aspiring AI engineer looking to deeply understand and implement LLM inference engines for career advancement, especially in preparing for technical interviews related to large models.
Not ideal if you are an end-user simply looking to apply pre-built LLMs without needing to understand or construct the underlying inference framework.
Stars: 509
Forks: 130
Language: C++
License: —
Category:
Last pushed: Oct 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/zjhellofss/KuiperLLama"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
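For scripted access, the curl call above could be wrapped along these lines. This is a minimal sketch, not a documented client: the function names `build_quality_url` and `fetch_quality` are hypothetical, and it assumes that other repositories follow the same `category/owner/repo` path layout as the URL shown above. The response format is not described on this page, so the sketch returns the body as plain text without parsing it.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL copied from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path layout is <category>/<owner>/<repo>, inferred
    from the single example URL on this page.
    """
    # Percent-encode each path segment so unusual names stay valid.
    segments = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text.

    The response schema is not documented here, so nothing is parsed.
    """
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_quality("llm-tools", "zjhellofss", "KuiperLLama")` would issue the same request as the curl command. With the keyless limit of 100 requests/day, a caller that polls many repositories would likely want to cache results locally.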
Higher-rated alternatives
vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
SemiAnalysisAI/InferenceX
Open Source Continuous Inference Benchmarking Qwen3.5, DeepSeek, GPTOSS - GB200 NVL72 vs MI355X...
kvcache-ai/Mooncake
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
uccl-project/uccl
UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache...
sophgo/tpu-mlir
Machine learning compiler based on MLIR for Sophgo TPU.