LLM Docker Deployments LLM Tools

Docker containerization and deployment solutions for running LLMs, inference servers, and related AI services locally or on networks. Does NOT include general containerization tools, Kubernetes orchestration, or non-LLM Docker projects.

141 LLM Docker deployment tools are tracked. One scores above 70 (Verified tier). The highest-rated is containers/ramalama at 79/100 with 2,640 stars. 4 of the top 10 are actively maintained.

Get all 141 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-docker-deployments&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
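The same request can be made from a script. The sketch below rebuilds the URL from the curl command above and decodes the body as JSON. The function names `build_url` and `fetch_projects` are hypothetical helpers, the response schema is not described on this page (so the decoded value is returned untouched), and whether `limit` accepts values above 20 is an assumption you would need to check against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="llm-docker-deployments", limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the response body as JSON.

    The response schema is not documented on this page, so the decoded
    value is returned as-is without assuming any particular keys.
    """
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (performs a network request and counts against the daily quota):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2))
```

Since unauthenticated access is capped at 100 requests/day, it is probably worth caching the decoded result locally rather than calling `fetch_projects` on every run.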

#	Tool	Score	Tier	Stars	Language
1	containers/ramalama RamaLama is an open-source developer tool that simplifies the local serving...	79	Verified	2,640	Python
2	av/harbor One command brings a complete pre-wired LLM stack with hundreds of services...	66	Established	2,498	TypeScript
3	RunanywhereAI/runanywhere-sdks Production ready toolkit to run AI locally	62	Established	10,245	C++
4	runpod-workers/worker-vllm The RunPod worker template for serving our large language model endpoints....	61	Established	406	Python
5	foldl/chatllm.cpp Pure C++ implementation of several models for real-time chatting on your...	59	Established	831	C++
6	FarisZahrani/llama-cpp-py-sync Auto-synced CFFI ABI python bindings for llama.cpp with prebuilt wheels...	58	Established	3	Python
7	vtuber-plan/olah Self-hosted huggingface mirror service.	57	Established	218	Python
8	quantalogic/qllm QLLM: A powerful CLI for seamless interaction with multiple Large Language...	54	Established	35	TypeScript
9	eastriverlee/LLM.swift LLM.swift is a simple and readable library that allows you to interact with...	53	Established	829	C++
10	varunvasudeva1/llm-server-docs End-to-end documentation to set up your own local & fully private LLM server...	52	Established	719	—
11	dingodb/dingospeed dingospeed is a self-hosted huggingface mirror service	51	Established	30	Go
12	sangyuxiaowu/LLamaWorker LLamaWorker is a HTTP API server developed based on the LLamaSharp project....	49	Emerging	80	C#
13	France-Travail/happy_vllm A REST API for vLLM, production ready	48	Emerging	27	Python
14	Scottcjn/llama-cpp-power8 AltiVec/VSX optimized llama.cpp for IBM POWER8	48	Emerging	47	C
15	lordmathis/llamactl Unified management and routing for llama.cpp, MLX and vLLM models with web dashboard.	47	Emerging	89	Go
16	jlonge4/local_llama This repo is to showcase how you can run a model locally and offline, free...	46	Emerging	298	Python
17	ashleykleynhans/runpod-worker-oobabooga RunPod Serverless Worker for Oobabooga Text Generation API for LLMs	46	Emerging	3	Python
18	liltom-eth/llama2-webui Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere...	46	Emerging	1,945	Jupyter Notebook
19	ai-action/ollama-action 🦙 Run Ollama large language models (LLMs) with GitHub Actions.	43	Emerging	22	—
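20	ADT109119/llamacpp-distributed-inference A distributed LLM inference program based on llama.cpp that lets multiple computers on a local network cooperate on distributed inference of large language models, using Electron...	43	Emerging	71	JavaScript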
21	icppWorld/icpp_llm on-chain LLMs	43	Emerging	19	C++
22	hitomi-team/sukima A ready-to-deploy container for implementing an easy to use REST API to...	42	Emerging	66	Python
23	timhagel/MeloTTS-Docker-API-Server A docker image to access MeloTTS through API calls	42	Emerging	56	Python
24	Flowm/llm-stack Docker compose config for local and hosted llms with multiple chat interfaces	42	Emerging	11	Python
25	sinfallas/opendevin-docker Run OpenDevin inside Docker	41	Emerging	24	Dockerfile
26	wsmlby/homl The easiest & fastest way to run LLMs in your home lab	40	Emerging	85	Python
27	feiyun0112/Local-LLM-Server quick way to build a private large language model server and provide...	40	Emerging	34	Python
28	aws-samples/sample-ollama-server Ollama on GPU EC2 instance with Open WebUI web interface and Bedrock access	39	Emerging	25	—
29	EvilFreelancer/docker-llama.cpp-rpc This project is based on llama.cpp and compiles only the RPC server, as well as...	39	Emerging	23	Shell
30	gsuuon/ad-llama Structured inference with Llama 2 in your browser	38	Emerging	52	TypeScript
31	teremterem/litellm-server-boilerplate A lightweight LiteLLM server boilerplate pre-configured with uv and Docker...	38	Emerging	11	Python
32	heyvaldemar/ollama-traefik-letsencrypt-docker-compose Ollama with Let's Encrypt Using Docker Compose	38	Emerging	23	Shell
33	Mcourtyard/m-courtyard M-Courtyard: Local AI Model Fine-tuning Assistant for Apple Silicon....	38	Emerging	67	TypeScript
34	sasha0552/ToriLinux Linux LiveCD for offline AI training and inference.	37	Emerging	19	Jinja
35	rgryta/LLM-WSL2-Docker One-click install for WizardLM-13B-Uncensored with oobabooga webui	37	Emerging	21	PowerShell
36	mitja/llamatunnel Publish local LLMs and LLM apps on the internet.	37	Emerging	27	Jinja
37	ai-action/setup-ollama 🦙 Set up GitHub Actions with Ollama CLI	37	Emerging	12	TypeScript
38	john-rocky/EdgeLLM Simple LLM package for ios devices.	36	Emerging	30	Swift
39	nicksavarese/allora-ios An iOS Keyboard Extension that allows for interacting with LLMs directly...	36	Emerging	52	Swift
40	DanielZhangyc/RLLM LLM powered RSS reader	36	Emerging	89	Swift
41	cdrage/containerfiles Containerfiles including AI, game servers, bootc and even a rickroll.	36	Emerging	38	Dockerfile
42	crowdllama/crowdllama CrowdLlama is a distributed system that leverages the open-source Ollama...	35	Emerging	22	Go
43	linonetwo/MOSS-DockerFile Runs Fudan's MOSS language model in Docker, with a WebUI provided by GradIO.	35	Emerging	16	Python
44	ruska-ai/llm-server 🤖 Open-source LLM server (OpenAI, Ollama, Groq, Anthropic) with support for...	35	Emerging	33	TypeScript
45	BlackTechX011/Ollama-in-GitHub-Codespaces Learn all how to run Ollama in GitHub Codespaces for free	35	Emerging	44	Jupyter Notebook
46	Jewelzufo/granitepi-4-nano Run IBM Granite 4.0 locally on Raspberry Pi 5 with Ollama. This is a...	35	Emerging	10	Shell
47	asreview/asreview-server-stack Docker compose for setting up ASReview server with authentication	35	Emerging	8	Dockerfile
48	Scottcjn/llama-cpp-tigerleopard WORLD FIRST: llama.cpp for Mac OS X Tiger & Leopard on PowerPC G4/G5	34	Emerging	25	C++
49	soulteary/docker-yi-runtime Local runtime environment for the 01.AI (零一万物) 34B model.	34	Emerging	9	Dockerfile
50	persys-ai/persys Welcome!	34	Emerging	140	—
51	alex0dd/llm-app-microservices-template Template for building microservice-based apps with a frontend, backend, LLM...	33	Emerging	5	HTML
52	ivangabriele-archives/docker-llm Pre-loaded LLMs served as an OpenAI-Compatible API via Docker images.	33	Emerging	6	Dockerfile
53	codygreen/llm_api_server Lab to demonstrate how to apply an API to an AI model and secure it.	33	Emerging	2	Jupyter Notebook
54	wizzard0/llama2.ts Llama2 inference in one TypeScript file	32	Emerging	20	JavaScript
55	g1ibby/homellm A simple Docker Compose boilerplate for deploying Open WebUI and LiteLLM...	32	Emerging	20	—
56	Malax/buildpack-ollama Cloud Native Buildpack that builds an OCI image with Ollama and a large...	32	Emerging	5	Rust
57	OutofAi/ChitChat Modal LLM LLama.cpp based model deployment as part of series of Model as a...	32	Emerging	17	Python
58	ivangabriele-archives/docker-functionary Ready-to-deploy Docker image for Functionary LLM served as an OpenAI-Compatible API.	32	Emerging	5	Dockerfile
59	AnLaVN/AL-Library Java utility library, contain many feature, support to Large Language Model...	32	Emerging	5	Java
60	m1ns09/Llama 🌐 Run GGUF models directly in your web browser using JavaScript and...	31	Emerging	2	HTML
61	raketenkater/llm-server Smart launcher for llama.cpp / ik_llama.cpp — auto-detects GPUs, optimizes...	31	Emerging	30	Shell
62	JimKw1kX/LLM-C2-Server An AI C2 Server	31	Emerging	3	Python
63	DataJourneyHQ/list-github-models GitHub action to track GitHub Models	31	Emerging	4	—
64	micbi-dt/lmstudio-docker run LMStudio within a Docker container	30	Emerging	19	Dockerfile
65	toku345/dgx-llm-serve Docker Compose configs for running LLM inference on DGX Spark (TensorRT-LLM...	30	Emerging	2	Python
66	openradx/llm_api_server_mock This is a simple fastapi based server mock that implements the OpenAI API.	29	Experimental	1	Jupyter Notebook
67	azer/llmcat Prepare files and directories for LLM consumption	29	Experimental	78	Shell
68	Scottcjn/power8-projects POWER8 Projects - Ubuntu 22.04 build, PSE LLM, Darwin cross-compile	29	Experimental	24	Shell
69	llmjava/hf_text_generation Hugging Face Text Generation API client for Java	29	Experimental	1	—
70	mordang7/LlamaForge The Ultimate Command Center for Local LLMs. A professional-grade GUI for...	28	Experimental	6	JavaScript
71	AiratTop/ollama-self-hosted A simple Docker Compose setup to self-host Ollama and Open WebUI. Run your...	27	Experimental	2	Shell
72	mdaconta/xlm-eco-api Cross Language Model (LLM/SLM/etc.) Ecosystem API (xlm-eco-api)	27	Experimental	1	Java
73	ggalancs/hfl CLI + API server to download, manage, and run 500K+ HuggingFace models...	26	Experimental	2	Python
74	qianniuspace/movie-detectives-server Camel Movie Detective Agency (server side)	26	Experimental	4	Python
75	nyo16/llama_cpp_ex Elixir bindings for llama.cpp — run LLMs locally with Metal, CUDA, Vulkan,...	26	Experimental	2	Elixir
76	mo-arvan/local-llm docker compose configuration file for running Llama-2 or any other language...	25	Experimental	4	Dockerfile
77	SuppieRK/local-ai-lab Offline-capable, open-source AI home lab notes: practical setups, configs,...	25	Experimental	1	Shell
78	LianHe-BI/Blackwell-optimized-llama.cpp-Docker-image Blackwell-optimized llama.cpp Docker image – works on all NVIDIA GPUs, but...	25	Experimental	4	—
79	Skyluker4/llama-runpod Docker image to run llama.cpp on runpod.io automatically	25	Experimental	1	Shell
80	yokingma/deepseek-vllm Deploys DeepSeek models with Docker and the official vLLM image, providing an OpenAI-like API service in production environments.	24	Experimental	15	—
81	ai-action/ollama-github-action-demo 🦙 Demos of large language models (LLMs) with Ollama in GitHub Actions.	23	Experimental	1	—
82	arseniy0924/rpc_manager Web UI for orchestrating distributed llama.cpp RPC GPU clusters with auto...	23	Experimental	2	JavaScript
83	Pavloffm/remote-llm-server Run Ollama in Docker. Share local LLMs across your network. GPU-accelerated.	23	Experimental	1	—
84	alasgarovs/openserv OpenServ is a simple Bash-based CLI tool for managing LLMs in llama.cpp server.	23	Experimental	1	Shell
85	Daaboulex/lmstudio-nix LM Studio packaged for NixOS — local LLM inference desktop app and server	22	Experimental	—	Nix
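86	qnianjinri-del/local-llm-recommender Detects your computer's hardware in one click, recommends the latest compatible open-source large models, and supports one-click deployment.	22	Experimental	—	Python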
87	somya-droid/Pirate-LLM-Server Run local LLM servers on iPhone with OpenAI-compatible API, Metal GPU...	22	Experimental	1	Swift
88	rjxby/llama-runtime `llama-runtime` is a high-performance inference server designed for local...	22	Experimental	—	C#
89	EricApgar/llm-server Host an LLM and make it accessible on a network via API.	22	Experimental	—	Python
90	gsavla6-hue/java-llm-integration Comprehensive Java LLM integration library supporting OpenAI, Anthropic and...	22	Experimental	—	Java
91	sebicom/llamacpp4j Java wrapper for llama.cpp	22	Experimental	6	Java
92	byang37/llama-runner A lightweight desktop GUI for llama-server — multi-model routing, per-model...	22	Experimental	—	HTML
93	Logicish/p-lanes A modular wrapper for llama.cpp focused on home-lab scaled hardware,...	22	Experimental	—	Shell
94	sithukyaw007/local-ai-workload Docker-first, local-first AI workload toolkit for macOS Apple Silicon using...	22	Experimental	—	Shell
95	MooNyeu/granitepi-4-nano 🔒 Run a large language model locally on your Raspberry Pi 5 with IBM Granite...	22	Experimental	—	—
96	tdiprima/ollama-orchestrator Self-hosted AI automation: manage Ollama models, deploy Open WebUI in...	22	Experimental	—	Shell
97	clixgvvv/AndroidLLMServerScript 📲 Create a local LLM server on Android using Python and llama.cpp for easy...	22	Experimental	—	Python
98	ebowwa-archive/LLM_telecenter A fastapi wrapper of babca / python-gsmmodem for a waveshare sim7600x. Not...	22	Experimental	6	Python
99	SergiuDeveloper/distributed-llama.cpp Distributed LLM inference across multiple machines. A central server routes...	22	Experimental	1	Go
100	ThomasVitale/llm-images Catalog of OCI images for popular open-source or open Large Language Models.	22	Experimental	16	Dockerfile
101	VityazevEgor/LLMapi4free LLMapi4free provides a unified API for free access to various large Language...	21	Experimental	3	Java
102	gperdrizet/llms-devcontainer Containerized development environment for LLM based projects	21	Experimental	—	Python
103	futursolo/pai Collection of AI Containers - Prebuilt and Ready-to-Use	21	Experimental	—	Dockerfile
104	llmjava/llm4j One API to access Large Language Models in Java	21	Experimental	11	Java
105	zyoung11/lmgo Windows system tray for llama.cpp + ROCm. Optimized for AMD RYZEN AI MAX+...	21	Experimental	—	Go
106	dmeldrum6/Llama-Forge Open source llama.cpp wrapper with server and client	21	Experimental	—	C#
107	nishant-sethi/python-ai-extension-server Python Server to use local LLMs	20	Experimental	5	Python
108	abdulazizalmalki-gh/local-ai A simple, self-hosted stack for running AI models locally using llama.cpp...	19	Experimental	—	Shell
109	sinfallas/llm-local-loader-docker docker compose to load ollama, flowise, langfuse, open-web-ui	19	Experimental	3	—
110	gustavostz/Local-AI-Open-Orca-For-Dummies Local AI Open Orca For Dummies is a user-friendly guide to running Large...	19	Experimental	3	Python
111	thkox/home-ai-server Home AI Server provides the backend infrastructure for the Home AI system....	19	Experimental	4	Python
112	kryoz/llama-strix-halo llama.cpp setup on dedicated AMD Strix Halo machine	18	Experimental	2	—
113	FlorinAndrei/local-inference-docs Run generative AI locally, on your hardware, for coding and other purposes	18	Experimental	10	—
114	merlijn/scala-llm-api Basic OpenAI client for Scala	18	Experimental	2	Scala
115	turtleio/turtle 🐰 shoulda been an app - 🐢	17	Experimental	1	—
116	MrTechyWorker/SmartLLM-Server Implementing a robust client-server architecture from scratch, designed to...	17	Experimental	1	Python
117	cyberguard-ai/local-llm-server A containerized, offline-capable LLM API powered by Ollama. Automatically...	15	Experimental	—	Python
118	stlin256/llama-remote A web-based remote control panel for managing llama.cpp instances. Monitor...	15	Experimental	1	TypeScript
119	phospho-app/fastassert Dockerized LLM inference server with constrained output (JSON mode), built...	15	Experimental	27	Jupyter Notebook
120	abhiFSD/llama.cpp-Monitor-Dashboard ⚡ Real-time monitoring dashboard for llama.cpp server — single HTML file,...	14	Experimental	1	HTML
121	Weebaay/local-ai-homelab Deployment of a local AI server on an Ubuntu Server 24.04 VM with Ollama and...	14	Experimental	—	—
122	mendhak/local-llm-workspace Private, secure, containerized LLM environment for chat and coding. Using...	14	Experimental	—	—
123	Riju007/dev-knowledge-vault 🧠 My second brain — hands-on engineering notes on Docker, AI, Python and beyond	14	Experimental	—	—
124	chaserbot/chaseworkslab-llm Self-hosted LLM stack (Ollama, Open WebUI, etc.) for the homelab	14	Experimental	—	—
125	nishantapatil3/litellm-compose Docker Compose setup for LiteLLM proxy server with PostgreSQL and Prometheus...	14	Experimental	—	—
126	57Ajay/model-runner A simple model runner using llama.cpp and huggingface	14	Experimental	1	Go
127	aayes89/JavaRNN-LLM An RNN written in pure Java to compete with Transformers	13	Experimental	—	Java
128	yeeking/llamacpp-minimal-example Minimal example of using llama cpp as library from cpp	13	Experimental	—	C++
129	ai-action/ai-inference-demo AI Inference in GitHub Actions demo	13	Experimental	—	—
130	beeracs/Llama Run Llama models in your web browser using JavaScript and WebAssembly....	13	Experimental	—	HTML
131	SwiftyAI/SwiftyMLC An example of integrating local LLMS using mlc-llm into an iOS app	13	Experimental	9	Swift
132	FarzamMohammadi/self-hosted-ai-stack Blog resources for building a self-hosted AI infrastructure. Contains all...	13	Experimental	—	JavaScript
133	desdeux/llama2odin Llama2.C port in Odin	13	Experimental	—	Odin
134	Doculoom/doculoom-server LLM backed API server	13	Experimental	—	Python
135	wronai/docker-platform Enterprise-grade secure media storage with AI analysis, role-based access,...	13	Experimental	—	Go
136	AntonSHBK/llm_service A FastAPI-based microservice for interacting with LLM (OpenAI API) with...	13	Experimental	—	Python
137	danerlt/llm-server Deploy large-model services using Docker-compose	11	Experimental	—	—
138	NoroSaroyan/JLLM-Connect Java library for seamless integration with LLM provider	11	Experimental	2	Java
139	jkawamoto/llama-cpp-api OpenAPI specification for the LLama.cpp HTTP Server	11	Experimental	—	—
140	marcosaugustoldo/install-anythingllm-ec2-aws-freetier Learn how to create an Anything LLM container on your AWS instance by...	10	Experimental	2	—
141	siddhant385/ollamaonActions Running Ollama on Github Actions	10	Experimental	2	—
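The Tier column appears to follow the Score column in fixed bands. The sketch below encodes boundaries inferred from the rows above, not from any documented rule: the page says the Verified tier is "above 70", 51 is listed as Established while 49 is Emerging, and 30 is Emerging while 29 is Experimental. How exact scores of 70 and 50 are classified is an assumption, since no row in the table has either score. `tier_for_score` is a hypothetical helper name.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 score to a tier label using bands inferred from the table.

    Assumptions (not documented on the page):
      - Verified means strictly above 70, following the page's wording.
      - Established starts at 50 (the table only shows 51 vs 49).
      - Emerging starts at 30 (the table shows 30 Emerging, 29 Experimental).
    """
    if score > 70:
        return "Verified"
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (score, tier) pairs copied from the table, used as a consistency check.
SAMPLE_ROWS = [
    (79, "Verified"),
    (66, "Established"),
    (51, "Established"),
    (49, "Emerging"),
    (30, "Emerging"),
    (29, "Experimental"),
    (10, "Experimental"),
]

mismatches = [(s, t) for s, t in SAMPLE_ROWS if tier_for_score(s) != t]
print("mismatches:", mismatches)
```

If the published tiers ever disagree with these bands, the `mismatches` list gives a quick way to spot which boundary has moved.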

Comparisons in this category

worker-vllm and runpod-worker-oobabooga (61 vs 46)