thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
Chitu is a production-grade large language model (LLM) inference engine designed to deploy AI models efficiently in real-world business scenarios. It serves trained LLMs against enterprise data, delivering fast, stable AI-powered responses. It is aimed at AI product managers, machine learning engineers, and MLOps teams bringing generative AI applications into production.
3,418 stars. Actively maintained with 111 commits in the last 30 days. Available on PyPI.
Use this if you need to run large language models reliably and efficiently across various hardware, from a single GPU to large-scale clusters, for enterprise-level AI applications.
Not ideal if you only need a simple tool for basic LLM experimentation or development and do not require high performance, scalability, or broad hardware support.
Stars: 3,418
Forks: 477
Language: Python
License: Apache-2.0
Last pushed: Mar 13, 2026
Commits (30d): 111
Dependencies: 21
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/thu-pacman/chitu"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
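The endpoint above returns the repository's quality data as JSON. A minimal Python sketch for calling it is below; note that the `Authorization: Bearer` header for keyed access and any response field names are assumptions, not a documented schema:

```python
"""Sketch of a client for the pt-edge quality API (schema assumed)."""
import json
import urllib.request
from typing import Optional

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key: Optional[str] = None) -> dict:
    """Fetch the quality record; pass an API key for the 1,000/day limit.

    The header name used for the key is a guess; check the API docs.
    """
    req = urllib.request.Request(build_url(owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Usage (performs a real network request):
#   data = fetch_quality("thu-pacman", "chitu")
#   print(json.dumps(data, indent=2))
```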
Related tools
sophgo/LLM-TPU
Run generative AI models in sophgo BM1684X/BM1688
NotPunchnox/rkllama
Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
howard-hou/VisualRWKV
VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle...
bentoml/llm-inference-handbook
Everything you need to know about LLM inference