OutofAi/ChitChat

Modal LLM LLama.cpp based model deployment as part of series of Model as a Service (MaaS)

/ 100

Emerging

This project helps developers quickly set up and deploy a custom large language model (LLM) as an inference endpoint. You provide a Llama.cpp-compatible model, and it outputs a web address that allows other applications to send text prompts to your model and receive responses. This is primarily for AI/ML developers or researchers who need to serve their own LLMs.

Use this if you are a developer looking for a straightforward and cost-effective way to deploy a Llama.cpp-compatible LLM and get an API endpoint for it.

Not ideal if you are a non-technical user looking for a ready-to-use chatbot or if you need to deploy models that are not compatible with Llama.cpp.

LLM-deployment API-development ML-infrastructure model-serving AI-developer-tools

No Package No Dependents

Maintenance 10 / 25

Adoption 6 / 25

Maturity 16 / 25

Community 0 / 25

How are scores calculated?

Stars

Forks

—

Language

Python

License

MIT

Higher-rated alternatives

containers/ramalama

RamaLama is an open-source developer tool that simplifies the local serving of AI models from...

av/harbor

One command brings a complete pre-wired LLM stack with hundreds of services to explore.

RunanywhereAI/runanywhere-sdks

Production ready toolkit to run AI locally

runpod-workers/worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

foldl/chatllm.cpp

Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)

Explore LLM Tools

All categories Trending LLM Tool directory Insights