NotPunchnox/rkllama
Ollama alternative for the Rockchip NPU: an efficient solution for running AI and deep learning models on Rockchip devices with optimized NPU support (rkllm)
RKLLama helps you run large language models (LLMs) and other AI models like image generators and speech-to-text on specialized Rockchip devices. It takes your input text, images, or audio and processes it using models optimized for your device's Neural Processing Unit (NPU), delivering quick AI-powered responses or content. This is for developers, tinkerers, or embedded system enthusiasts building AI applications on Rockchip RK3588(S) or RK3576 hardware.
Use this if you need to deploy and manage AI models on Rockchip-powered single-board computers, leveraging their NPU for faster and more efficient inference.
Not ideal if you're looking for a cloud-based AI solution or if you don't have specific Rockchip RK3588(S) or RK3576 hardware.
Stars: 447
Forks: 71
Language: Python
License: GPL-3.0
Category:
Last pushed: Mar 09, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/NotPunchnox/rkllama"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
Related tools
thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency,...
sophgo/LLM-TPU
Run generative AI models on Sophgo BM1684X/BM1688
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
howard-hou/VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...
bentoml/llm-inference-handbook
Everything you need to know about LLM inference