wudingjian/rkllm_chat
Deploy LLM models to the Rockchip RK3588 chip and run inference on the development board's NPU.
This project helps embedded systems developers and hobbyists deploy large language models (LLMs) such as Qwen or TinyLLAMA directly onto Rockchip RK3588 development boards. It takes pre-trained LLM models as input and produces an optimized, executable version that runs efficiently on the board's Neural Processing Unit (NPU). This enables local, on-device AI chat functionality without relying on cloud services.
No commits in the last 6 months.
Use this if you are a hardware enthusiast or developer looking to run popular LLMs offline and directly on your Rockchip RK3588 development board for local AI applications.
Not ideal if you are looking for a cloud-based LLM solution, a general-purpose LLM development kit for various hardware, or if you don't have experience with embedded Linux and Docker.
Stars: 72
Forks: 14
Language: Python
License: —
Category: —
Last pushed: Oct 06, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wudingjian/rkllm_chat"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
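The same endpoint can be queried from a script instead of curl. A minimal Python sketch, using only the standard library; the endpoint URL comes from the listing above, but the assumption that it returns JSON (and any particular field layout) is not documented here:

```python
import json
import urllib.request

# Endpoint shown in the listing above; anonymous access is rate-limited
# to 100 requests/day.
API_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/wudingjian/rkllm_chat"


def fetch_repo_quality(url: str = API_URL) -> dict:
    """Fetch the repo-quality record and parse it as JSON.

    The JSON response shape is an assumption; inspect the real
    response before relying on specific fields.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

A caller would simply do `data = fetch_repo_quality()` and inspect the returned dictionary.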
Higher-rated alternatives
thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency,...
sophgo/LLM-TPU
Run generative AI models in sophgo BM1684X/BM1688
NotPunchnox/rkllama
Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
howard-hou/VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...