ruska-ai/llm-server
🤖 Open-source LLM server (OpenAI, Ollama, Groq, Anthropic) with support for HTTP, streaming, agents, and RAG (deprecated; check out Orchestra)
This project helps prompt engineers and developers quickly set up a local server to experiment with large language models (LLMs) and build AI applications. It takes your configuration for various LLM providers (such as OpenAI, Groq, or local Ollama instances) and data sources (such as vector databases for RAG) and exposes a unified API endpoint. This lets you integrate different LLMs, create agents, and add retrieval-augmented generation (RAG) capabilities to your applications without managing each service separately.
No commits in the last 6 months.
Use this if you are a developer looking for a local, unified server to prototype and integrate various LLMs, agents, and RAG capabilities into your applications.
Not ideal if you are a non-technical end-user or if you need a production-ready, actively maintained solution, as this project is deprecated.
Stars
33
Forks
13
Language
TypeScript
License
—
Category
—
Last pushed
Jun 10, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ruska-ai/llm-server"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
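The curl command above can also be issued from TypeScript. A minimal sketch, assuming the endpoint returns JSON (the response shape is not documented on this page, so it is typed as `unknown`; `qualityUrl` is a hypothetical helper, not part of the API):

```typescript
// Base path taken from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools";

// Build the per-repository URL; owner and repo are URL-encoded defensively.
function qualityUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the quality data; the JSON shape is an assumption, hence `unknown`.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) throw new Error(`Request failed with HTTP ${res.status}`);
  return res.json();
}
```

Usage: `await fetchQuality("ruska-ai", "llm-server")` hits the same URL as the curl example, subject to the same 100 requests/day limit.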
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production-ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)