nekomeowww/ollama-operator
Yet another operator for running large language models on Kubernetes with ease. Powered by Ollama!
This project helps operations engineers and MLOps teams easily deploy and manage multiple large language models (LLMs) on a Kubernetes cluster. You provide the name of an Ollama-compatible model, and the operator handles fetching, loading, and running it as a service. It's designed for those who need to scale their LLM inference capabilities beyond a single machine, integrating seamlessly into existing Kubernetes infrastructure.
Use this if you are an operations engineer or MLOps specialist managing a Kubernetes environment and need to deploy and scale various large language models for different applications or teams.
Not ideal if you only need to run LLMs locally on a single machine or are not working with Kubernetes infrastructure.
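The workflow described above (name a model, let the operator fetch and serve it) can be sketched with a Model custom resource. This is a minimal, hypothetical example: the API group/version (`ollama.ayaka.io/v1`), the `image` field, and the model name `phi` are assumptions to illustrate the shape of the manifest; check the operator's own documentation for the exact CRD schema.

```shell
# Hypothetical sketch: deploy an Ollama-compatible model through the
# operator's Model custom resource. Field names and the API group are
# assumed, not confirmed against the operator's CRD definition.
kubectl apply -f - <<'EOF'
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi   # name of an Ollama-compatible model to fetch and run
EOF

# Once reconciled, the operator exposes the model as a Service; you could
# then port-forward and query it like any in-cluster workload:
kubectl port-forward svc/phi 11434:11434
```

The appeal of this pattern is that scaling and scheduling fall out of standard Kubernetes machinery: the operator reconciles the desired model into Deployments and Services rather than requiring a bespoke serving stack.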
Stars: 234
Forks: 26
Language: Go
License: Apache-2.0
Category:
Last pushed: Mar 12, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/nekomeowww/ollama-operator"
Open to everyone: 100 requests/day with no key required. Get a free key for 1,000/day.
Related models
ludwig-ai/ludwig
Low-code framework for building custom LLMs, neural networks, and other AI models
withcatai/node-llama-cpp
Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema...
mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and...
zhudotexe/kani
kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
SciSharp/LLamaSharp
A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.