node-llama-cpp and llama-swap
These tools are complements: node-llama-cpp provides a local LLM inference engine for Node.js applications, while llama-swap manages dynamic model switching across OpenAI-compatible servers, letting you run multiple local models and swap between them on demand.
About node-llama-cpp
withcatai/node-llama-cpp
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model output at the generation level.
This project helps JavaScript and TypeScript developers integrate advanced AI capabilities directly into their applications by running large language models (LLMs) on their own machines. Developers input a language model and prompts, and the tool outputs structured text, function calls, or embeddings, enabling features like smart chatbots, data summarization, or advanced search within their applications. It's designed for developers building AI-powered features without relying on external cloud services.
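As a concrete illustration, here is a minimal sketch of loading a model and prompting it with node-llama-cpp's v3 API (`getLlama`, `LlamaChatSession`). The model filename is a hypothetical placeholder; substitute any GGUF file you have locally.

```typescript
import path from "node:path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";

// Initialize the llama.cpp bindings (downloads/builds the native binary if needed)
const llama = await getLlama();

// Assumed local GGUF path — replace with a real model file on your machine
const model = await llama.loadModel({
    modelPath: path.join("models", "my-model.gguf")
});

// Create an inference context and a chat session bound to one of its sequences
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const answer = await session.prompt("Summarize llama.cpp in one sentence.");
console.log(answer);
```

Because inference runs in-process, no separate server needs to be started; the trade-off is that the model occupies memory for the lifetime of the Node.js process.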
About llama-swap
mostlygeek/llama-swap
Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc
This tool helps AI application developers manage multiple local generative AI models efficiently. It acts as a smart traffic controller: it accepts requests for various AI tasks (such as text generation, image creation, or speech processing) and, based on the model name in each request, automatically starts and routes to the matching local model server. Developers building AI applications will find this useful for testing and deploying different models without manual intervention.
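To show the routing idea concretely, here is a hypothetical llama-swap configuration sketch. The model names, file paths, and `ttl` values are assumptions for illustration; `${PORT}` stands for the port llama-swap assigns to the backend it launches.

```yaml
# Hypothetical llama-swap config — model names and paths are placeholders
models:
  "qwen-coder":
    cmd: llama-server --port ${PORT} -m /models/qwen-coder.gguf
    ttl: 300          # unload after 5 minutes of inactivity
  "llama-8b":
    cmd: llama-server --port ${PORT} -m /models/llama-8b.gguf
```

With a config like this, a request to the proxy's OpenAI-compatible endpoint naming `"model": "qwen-coder"` would cause llama-swap to stop any other running backend, launch the matching `llama-server` process, and forward the request to it.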