jina-ai/rungpt
An open-source, cloud-native serving framework for large multi-modal models (LMMs).
RunGPT simplifies the process of making large language models (LLMs) available for use across various applications. It takes a pre-trained LLM and deploys it on a cluster of GPUs, turning it into a service that other programs can interact with to generate text or engage in chat. This tool is for machine learning engineers or developers who need to host and manage their own LLMs at scale.
165 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to deploy and manage large language models efficiently across multiple GPUs for high-traffic, low-latency applications.
Not ideal if you are a casual user looking for a simple chatbot interface or don't have access to distributed GPU infrastructure.
Stars: 165
Forks: 21
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 05, 2023
Commits (30d): 0
Dependencies: 16
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/jina-ai/rungpt"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
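The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body, which the listing does not confirm. The helper names `quality_url` and `fetch_quality` are hypothetical, and treating the `transformers` path segment as a category is also an assumption based on the URL shown in the curl command.

```python
import json
import urllib.request

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository such as 'jina-ai/rungpt'.

    `category` is the path segment before the repository name
    ('transformers' in the curl example); its meaning is assumed.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("transformers", "jina-ai/rungpt")` performs the network request. Since the response fields are not documented here, inspect the returned object before relying on any particular key.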
Related models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...