mostlygeek/llama-swap
Reliable model swapping for any local OpenAI/Anthropic-compatible server (llama.cpp, vllm, etc.)
This tool helps AI application developers efficiently manage multiple local generative AI models on one machine. It acts as a smart traffic controller: it accepts requests for various AI tasks (text generation, image creation, speech processing) and automatically routes each one to the correct local model server, swapping models in and out as needed. Developers building AI applications will find this useful for testing and deploying different models without manual intervention.
2,775 stars. Actively maintained with 20 commits in the last 30 days.
Use this if you need to run and seamlessly switch between different local generative AI models, such as those compatible with OpenAI or Anthropic APIs, for your applications.
Not ideal if you only ever use a single generative AI model and don't need to swap between them or manage multiple local AI servers.
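To make the swapping behavior concrete, here is a minimal sketch of a llama-swap config file. The model names and file paths are hypothetical, and the field names follow llama-swap's documented YAML format; check the project README for the exact, current schema.

```yaml
# config.yaml: each entry maps a model name to the command that
# launches its backend server. ${PORT} is a placeholder llama-swap
# fills in with the port it proxies to.
models:
  "qwen-7b":
    cmd: llama-server --port ${PORT} -m /models/qwen-7b.gguf
  "llama-8b":
    cmd: llama-server --port ${PORT} -m /models/llama-8b.gguf
```

Clients then send ordinary OpenAI-style requests to llama-swap; the `model` field of each request determines which backend is launched (unloading the previously active model first), so switching between models needs no manual intervention.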
Stars: 2,775
Forks: 205
Language: Go
License: MIT
Category:
Last pushed: Mar 12, 2026
Commits (30d): 20
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mostlygeek/llama-swap"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related projects
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
🤖 The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.