mlc-llm and llm-deploy

These tools appear to be **competitors**: both aim to solve LLM deployment and serving. MLC LLM offers a universal engine built on ML compilation for broad cross-platform deployment, while `llm-deploy` focuses on specific inference backends such as TensorRT-LLM and vLLM.

| | mlc-llm | llm-deploy |
| --- | --- | --- |
| Overall score | 62 (Established) | 28 (Experimental) |
| Maintenance | 17/25 | 10/25 |
| Adoption | 10/25 | 6/25 |
| Maturity | 16/25 | 8/25 |
| Community | 19/25 | 4/25 |
| Stars | 22,185 | 22 |
| Forks | 1,960 | 1 |
| Downloads | N/A | N/A |
| Commits (30d) | 16 | 0 |
| Language | Python | Python |
| License | Apache-2.0 | None |
| Package | None | None |
| Dependents | None | None |

About mlc-llm

mlc-ai/mlc-llm

Universal LLM Deployment Engine with ML Compilation

This project helps machine learning engineers deploy large language models (LLMs) efficiently across a wide range of devices and operating systems. You feed in a trained LLM, and it produces an optimized, high-performance build that runs natively on platforms such as web browsers, mobile devices (iOS, Android), and GPUs from NVIDIA, AMD, Apple, and Intel. It is aimed at ML engineers who need their LLMs to run directly on end-user hardware, not just in the cloud.

AI deployment · edge AI · mobile machine learning · LLM optimization · cross-platform AI
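
As an illustration of the deployment flow described above, here is a minimal chat-completion sketch following the OpenAI-style Python API shown in MLC LLM's README. The Hugging Face model ID is one of the project's prebuilt example checkpoints, not a requirement; substitute any MLC-compiled model.

```python
from mlc_llm import MLCEngine

# Example model ID: a prebuilt, MLC-quantized Llama 3 checkpoint on Hugging Face.
model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# The engine exposes an OpenAI-compatible chat-completions interface,
# so existing OpenAI client code ports over with minimal changes.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is MLC LLM?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.terminate()
```

The same compiled model artifact can also be served via the project's REST server or embedded in iOS/Android apps, which is the cross-platform point the description makes.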

About llm-deploy

lix19937/llm-deploy

AI Infra LLM infer/ tensorrt-llm/ vllm

This project helps AI infrastructure engineers optimize large language model (LLM) inference. It provides techniques and frameworks for accelerating LLM inference, reducing the time it takes to get a response (latency) and increasing the number of requests handled per second (throughput). Its audience is engineers responsible for deploying and maintaining LLM-powered applications in production environments.

AI-infrastructure · LLM-deployment · model-serving · GPU-optimization · inference-acceleration
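
To ground the latency/throughput discussion, here is a minimal offline batch-inference sketch using the public Python API of vLLM, one of the backends the project covers. The model name and sampling parameters are placeholders for illustration, not anything prescribed by `llm-deploy`.

```python
from vllm import LLM, SamplingParams

# Example model; any Hugging Face causal LM that vLLM supports works here.
llm = LLM(model="facebook/opt-125m")

# Batching many prompts through one engine call is how vLLM trades a little
# per-request latency for much higher aggregate throughput.
prompts = [
    "Explain PagedAttention in one sentence.",
    "What limits LLM serving throughput?",
]
params = SamplingParams(temperature=0.8, max_tokens=64)

for output in llm.generate(prompts, params):
    print(output.prompt)
    print(output.outputs[0].text)
```

In production the same engine is typically run behind vLLM's OpenAI-compatible HTTP server rather than called in-process, which is the deployment scenario this project's material targets.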

Scores updated daily from GitHub, PyPI, and npm data.