alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering high-performance on-device LLMs and Edge AI.
This project helps developers integrate advanced AI capabilities, such as large language models and image generation, directly into applications running on mobile phones, PCs, or IoT devices. It takes pre-trained AI models as input and runs them with optimized, high-performance on-device inference, enabling features like offline AI chatbots or on-device image editing. It is aimed at software engineers and product developers building AI-powered applications for edge devices.
14,526 stars. Actively maintained with 52 commits in the last 30 days. Available on PyPI.
Use this if you are a developer looking to embed performant AI features, such as conversational AI or creative image tools, directly into your mobile, desktop, or IoT applications without relying on cloud services.
Not ideal if you are an end-user simply looking for an AI application to use, rather than a developer building one.
Stars: 14,526
Forks: 2,234
Language: C++
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 52
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/alibaba/MNN"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related projects
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...
tenstorrent/tt-metal
:metal: TT-NN operator library, and TT-Metalium low level kernel programming model.