jina-ai/rungpt

An open-source, cloud-native serving framework for large multi-modal models (LMMs).

/ 100

Established

RunGPT simplifies the process of making large language models (LLMs) available for use across various applications. It takes a pre-trained LLM and deploys it on a cluster of GPUs, turning it into a service that other programs can interact with to generate text or engage in chat. This tool is for machine learning engineers or developers who need to host and manage their own LLMs at scale.

165 stars. No commits in the last 6 months. Available on PyPI.

Use this if you need to deploy and manage large language models efficiently across multiple GPUs for high-traffic, low-latency applications.

Not ideal if you are a casual user looking for a simple chatbot interface or don't have access to distributed GPU infrastructure.

LLM deployment, model serving, cloud infrastructure, machine learning operations, distributed systems

Stale 6m

Maintenance 0 / 25

Adoption 10 / 25

Maturity 25 / 25

Community 15 / 25


Stars

165

Forks

Language

Python

License

Apache-2.0

Related models

vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

sgl-project/sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

alibaba/MNN

MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...

xorbitsai/inference

Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...

tensorzero/tensorzero

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...
