ruska-ai/llm-server
🤖 Open-source LLM server (OpenAI, Ollama, Groq, Anthropic) with support for HTTP, streaming, agents, and RAG (deprecated; check out Orchestra)
This project helps prompt engineers and developers quickly set up a local server to experiment with large language models (LLMs) and build AI applications. It takes your configuration for various LLM providers (such as OpenAI, Groq, or local Ollama instances) and data sources (such as vector databases for RAG) and exposes a unified API endpoint. This lets you integrate different LLMs, create agents, and add retrieval-augmented generation (RAG) capabilities to your applications without managing each service separately.
No commits in the last 6 months.
Use this if you are a developer looking for a local, unified server to prototype and integrate various LLMs, agents, and RAG capabilities into your applications.
Not ideal if you are a non-technical end-user or if you need a production-ready, actively maintained solution, as this project is deprecated.
Stars
33
Forks
13
Language
TypeScript
License
—
Category
—
Last pushed
Jun 10, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ruska-ai/llm-server"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000 requests/day.
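The curl command above can also be issued from TypeScript. A minimal sketch, assuming the endpoint returns JSON (the response shape is not documented on this page, so it is typed as `unknown`; `qualityUrl` is a hypothetical helper, not part of the API):

```typescript
// Base path taken from the curl example above.
const API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools";

// Build the per-repository URL; owner and repo are URL-encoded defensively.
function qualityUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the quality data; the JSON shape is an assumption, hence `unknown`.
async function fetchQuality(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(qualityUrl(owner, repo));
  if (!res.ok) throw new Error(`Request failed with HTTP ${res.status}`);
  return res.json();
}
```

Usage: `await fetchQuality("ruska-ai", "llm-server")` hits the same URL as the curl example, subject to the same 100 requests/day limit.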
Higher-rated alternatives
containers/ramalama
RamaLama is an open-source developer tool that simplifies the local serving of AI models from...
av/harbor
One command brings a complete pre-wired LLM stack with hundreds of services to explore.
RunanywhereAI/runanywhere-sdks
Production-ready toolkit to run AI locally
runpod-workers/worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
foldl/chatllm.cpp
Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)