alibaba/rtp-llm

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Score: 70 / 100 (Verified)

RTP-LLM is a high-performance engine for deploying large language models (LLMs) in real-world applications. It serves your trained LLM, including models that accept multimodal inputs such as images alongside text, and efficiently generates responses for a large number of users. It is designed for engineers and AI product managers who run LLM-powered services, such as AI assistants or smart search features, at scale.

1,065 stars. Actively maintained with 163 commits in the last 30 days.

Use this if you need to run large language models reliably and quickly for many users within a production environment, especially for applications like AI chatbots, intelligent customer support, or advanced search.

Not ideal if you are only experimenting with LLMs on a personal computer or do not require enterprise-grade performance and scalability for your AI applications.

Tags: AI-application-deployment, LLM-serving, AI-platform-operations, conversational-AI, enterprise-search
No Package, No Dependents
Maintenance 22 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25


Stars: 1,065
Forks: 159
Language: Cuda
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 163

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/alibaba/rtp-llm"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
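
The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `quality_url` and `fetch_quality` are hypothetical, the `category/owner/repo` path layout is inferred from the single URL shown above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    The category/owner/repo layout is an assumption inferred from the
    single example URL; it is not documented on this page.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality data without an API key.

    Assumes the response body is JSON (not confirmed by this page).
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make the real request.
    print(quality_url("transformers", "alibaba", "rtp-llm"))
```

Because the keyless tier is limited to 100 requests/day, a script that polls this endpoint would likely want to cache results rather than call `fetch_quality` on every run.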