sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
This project helps developers and MLOps engineers deploy and manage large language and multimodal models efficiently. Given a trained model and the available hardware, it optimizes inference to deliver faster, more cost-effective serving. It is aimed at technical professionals building and operating AI-powered applications.
24,410 stars. Used by 5 other packages. Actively maintained with 994 commits in the last 30 days. Available on PyPI.
Use this if you need to serve large language models or multimodal models with high performance, low latency, and broad hardware compatibility.
Not ideal if you are looking for an off-the-shelf AI application or a framework for training models from scratch.
Stars: 24,410
Forks: 4,799
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 994
Dependencies: 64
Reverse dependents: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/sgl-project/sglang"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
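The curl call above can also be issued from Python. A minimal sketch, assuming only the endpoint URL shown on this page; the response schema is not documented here, so the example builds the URL and leaves the live fetch commented out:

```python
# Sketch of calling the pt-edge quality API (URL taken from this page;
# the JSON response shape is not documented here, so it is not parsed).
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{owner}/{repo}"

if __name__ == "__main__":
    url = quality_url("sgl-project", "sglang")
    # Live fetch (needs network; 100 requests/day without a key):
    # data = json.loads(urllib.request.urlopen(url).read())
    print(url)
```

Swapping in an API key for higher limits would depend on the service's auth scheme, which this page does not specify.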
Related models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...
tenstorrent/tt-metal
:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.