1b5d/llm-api
Run any Large Language Model behind a unified API
This project provides a straightforward way to run various Large Language Models (LLMs) on your own hardware, in Docker or directly on your machine. You supply a simple configuration file specifying the model; the project downloads and runs it, exposing it through a unified API. Developers, researchers, and creators who want to integrate LLMs into their applications can use it for text generation and embeddings.
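As a rough illustration of what "unified API" means in practice, the sketch below builds a text-generation request against a locally hosted instance. The base URL, the `/generate` path, and the `{"prompt": ..., "params": ...}` payload shape are assumptions for illustration only, not the project's documented schema; check the repository's README for the real endpoints.

```python
import json
from urllib import request

# Assumed local address of a self-hosted llm-api instance (hypothetical).
BASE_URL = "http://localhost:8000"

def build_generate_request(prompt: str, **params) -> request.Request:
    """Build (but do not send) a POST request for text generation.

    The endpoint path and payload fields here are illustrative
    assumptions, not the project's confirmed API schema.
    """
    body = json.dumps({"prompt": prompt, "params": params}).encode()
    return request.Request(
        f"{BASE_URL}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: request 64 tokens of completion for a prompt.
req = build_generate_request("Hello, world", max_tokens=64)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would require a running server; the sketch only shows how a consistent request shape lets the same client code target whichever model the configuration file selects.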
171 stars. No commits in the last 6 months.
Use this if you are a developer, researcher, or creator who needs to self-host and interact with different open-source Large Language Models through a consistent API without dealing with complex setup.
Not ideal if you prefer to use managed cloud-based LLM services or do not have the technical expertise to work with Docker and API endpoints.
Stars: 171
Forks: 28
Language: Python
License: MIT
Category:
Last pushed: Nov 13, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/1b5d/llm-api"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...