InternLM/lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Quality score: 67 / 100 (Established)

Deploying and serving large language models (LLMs) or visual language models (VLMs) can be complex and resource-intensive. This toolkit helps you compress these models and efficiently serve them so you can get more responses per second from your hardware. It takes your existing large language or visual models and outputs an optimized, ready-to-serve model, making it ideal for engineers and MLOps professionals managing AI inference infrastructure.

7,680 stars. Actively maintained with 56 commits in the last 30 days.

Use this if you need to serve large language models or visual language models on your own infrastructure and want to maximize efficiency and throughput while minimizing hardware costs.

Not ideal if you're a casual user looking for a pre-built chatbot or an API-based service, as this tool requires technical expertise to set up and manage.

Tags: LLM deployment, MLOps, AI inference, model optimization, large language models
No package · No dependents
Maintenance: 22 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 19 / 25


Stars: 7,680
Forks: 661
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 56

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/InternLM/lmdeploy"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
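If you'd rather query the endpoint from code than from `curl`, a minimal sketch is below. The base URL is taken verbatim from the `curl` example above; the response's JSON structure and any field names are assumptions, so inspect the decoded payload (or the API docs) before relying on specific keys.

```python
import json
import urllib.request

# Base endpoint copied from the page's curl example; the "transformers"
# path segment is part of that URL, not something we chose.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-report URL for a GitHub repository."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (makes a network call)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)


# Example (requires network access; the anonymous tier allows 100 requests/day):
# report = fetch_quality("InternLM", "lmdeploy")
# print(report)
```

Only standard-library modules are used, so the snippet runs without extra dependencies; swap in `requests` or `httpx` if you already use them.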