1b5d/llm-api
Run any Large Language Model behind a unified API
This project provides a straightforward way to run various Large Language Models (LLMs) on your own hardware, in Docker or directly on your machine. You supply a simple configuration file specifying the model; the project downloads and runs it, exposing it through a unified API. Developers, researchers, and creators who want to integrate LLMs into their applications can use it for text generation and embeddings.
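As a rough illustration of what "unified API" means in practice, the sketch below builds a text-generation request against a locally hosted instance. The base URL, the `/generate` path, and the `{"prompt": ..., "params": ...}` payload shape are assumptions for illustration only, not the project's documented schema; check the repository's README for the real endpoints.

```python
import json
from urllib import request

# Assumed local address of a self-hosted llm-api instance (hypothetical).
BASE_URL = "http://localhost:8000"

def build_generate_request(prompt: str, **params) -> request.Request:
    """Build (but do not send) a POST request for text generation.

    The endpoint path and payload fields here are illustrative
    assumptions, not the project's confirmed API schema.
    """
    body = json.dumps({"prompt": prompt, "params": params}).encode()
    return request.Request(
        f"{BASE_URL}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example: request 64 tokens of completion for a prompt.
req = build_generate_request("Hello, world", max_tokens=64)
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would require a running server; the sketch only shows how a consistent request shape lets the same client code target whichever model the configuration file selects.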
171 stars. No commits in the last 6 months.
Use this if you are a developer, researcher, or creator who needs to self-host and interact with different open-source Large Language Models through a consistent API without dealing with complex setup.
Not ideal if you prefer to use managed cloud-based LLM services or do not have the technical expertise to work with Docker and API endpoints.
Stars: 171
Forks: 28
Language: Python
License: MIT
Category:
Last pushed: Nov 13, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/1b5d/llm-api"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...