mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
MLC LLM helps machine learning engineers deploy large language models (LLMs) efficiently across a wide range of devices and operating systems. Given a trained LLM, it compiles an optimized, high-performance build that runs natively on platforms such as web browsers, mobile devices (iOS and Android), and desktop GPUs (NVIDIA, AMD, Apple, Intel). It is aimed at ML engineers who need their models to run directly on end-user hardware, not just in the cloud.
Use this if you need to deploy a large language model directly on edge devices, mobile phones, or web browsers and want high performance with broad hardware compatibility.
Not ideal if you deploy LLMs primarily on cloud servers, or if you don't need native, optimized performance across diverse hardware.
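To make the workflow concrete, here is a minimal sketch of running a compiled model through the project's OpenAI-compatible Python API (MLCEngine), following its quick-start docs. The model ID is illustrative; prebuilt MLC weights are fetched from Hugging Face on first use.

from mlc_llm import MLCEngine

# Illustrative model ID: prebuilt 4-bit quantized Llama 3 weights in MLC format.
model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)

# Stream a chat completion through the OpenAI-compatible interface.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()

engine.terminate()  # shut down the background engine cleanly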
Stars: 22,185
Forks: 1,960
Language: Python
License: Apache-2.0
Last pushed: Mar 09, 2026
Commits (30d): 16
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mlc-ai/mlc-llm"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
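For scripted access, a minimal sketch of the same call in Python. The endpoint is the one from the curl command above; the response schema isn't documented here, so the sketch just prints the parsed JSON, and it stays unauthenticated because how a key is passed isn't documented either.

import requests

# Same endpoint as the curl example above (100 requests/day without a key).
url = "https://pt-edge.onrender.com/api/v1/quality/transformers/mlc-ai/mlc-llm"

resp = requests.get(url, timeout=10)
resp.raise_for_status()

# The response field names aren't documented here; inspect the JSON
# before relying on any particular key.
print(resp.json())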
Related projects
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips
AmpereComputingAI/ampere_model_library
AML's goal is to make benchmarking of various AI architectures on Ampere CPUs a pleasurable experience :)