thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Chitu is a production-grade large language model (LLM) inference engine designed to deploy AI models efficiently in real-world business scenarios. It serves trained LLMs against enterprise data, delivering fast, stable AI-powered responses. It is aimed at AI product managers, machine learning engineers, and MLOps teams bringing generative AI applications into production.
3,418 stars. Actively maintained with 111 commits in the last 30 days. Available on PyPI.
Use this if you need to run large language models reliably and efficiently across various hardware, from a single GPU to large-scale clusters, for enterprise-level AI applications.
Not ideal if you only need a simple tool for basic LLM experimentation or development and do not require high performance, scalability, or broad hardware support.
Stars: 3,418
Forks: 477
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 111
Dependencies: 21
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/thu-pacman/chitu"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
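The endpoint above returns the repository's quality data as JSON. A minimal Python sketch for calling it is below; note that the `Authorization: Bearer` header for keyed access and any response field names are assumptions, not a documented schema:

```python
"""Sketch of a client for the pt-edge quality API (schema assumed)."""
import json
import urllib.request
from typing import Optional

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key: Optional[str] = None) -> dict:
    """Fetch the quality record; pass an API key for the 1,000/day limit.

    The header name used for the key is a guess; check the API docs.
    """
    req = urllib.request.Request(build_url(owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Usage (performs a real network request):
#   data = fetch_quality("thu-pacman", "chitu")
#   print(json.dumps(data, indent=2))
```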
Related tools
sophgo/LLM-TPU
Run generative AI models in sophgo BM1684X/BM1688
NotPunchnox/rkllama
Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
howard-hou/VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...
bentoml/llm-inference-handbook
Everything you need to know about LLM inference