LLM Docker Deployments: Transformer Models

There are 20 LLM Docker deployment models tracked. 2 score 50 or above (Established tier). The highest-rated is beehive-lab/GPULlama3.java at 51/100 with 238 stars.

Get all 20 projects as JSON:

```bash
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=llm-docker-deployments&limit=20"
```

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
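
If you prefer scripting over raw curl, the same request is easy to make from Python. This is a minimal sketch; the `projects`, `model`, `score`, and `tier` keys are assumptions about the response shape, not documented fields, so adjust them to whatever the endpoint actually returns.

```python
import requests  # third-party: pip install requests

# Same endpoint and query parameters as the curl example above.
url = "https://pt-edge.onrender.com/api/v1/datasets/quality"
params = {
    "domain": "transformers",
    "subcategory": "llm-docker-deployments",
    "limit": 20,
}

resp = requests.get(url, params=params, timeout=30)
resp.raise_for_status()
data = resp.json()

# ASSUMPTION: the payload contains a "projects" list whose items carry
# "model", "score", and "tier" fields. Inspect `data` to confirm.
for project in data.get("projects", []):
    print(project.get("model"), project.get("score"), project.get("tier"))
```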

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | beehive-lab/GPULlama3.java | GPU-accelerated Llama3.java inference in pure Java using TornadoVM. | 51 | Established |
| 2 | gitkaz/mlx_gguf_server | This is a FastAPI based LLM server. Load multiple LLM models (MLX or... | 50 | Established |
| 3 | srgtuszy/llama-cpp-swift | Swift bindings for the llama-cpp library | 44 | Emerging |
| 4 | JackZeng0208/llama.cpp-android-tutorial | llama.cpp tutorial on an Android phone | 40 | Emerging |
| 5 | awinml/llama-cpp-python-bindings | Run fast LLM inference using Llama.cpp in Python | 37 | Emerging |
| 6 | RhinoDevel/mt_llm | Pure C wrapper library to use llama.cpp with Linux and Windows as simple as... | 36 | Emerging |
| 7 | dougeeai/llama-cpp-python-wheels | Pre-built wheels for llama-cpp-python across platforms and CUDA versions | 34 | Emerging |
| 8 | GURPREETKAURJETHRA/Ollama-UseCases | This repo brings numerous use cases from the open-source Ollama | 34 | Emerging |
| 9 | lennartpollvogt/ollama-instructor | Python library for the instruction and reliable validation of structured... | 33 | Emerging |
| 10 | AbhinaavRamesh/ollama-local-serve | Local LLM infrastructure for distributed AI applications. Serve... | 32 | Emerging |
| 11 | muhac/llm-actions | Run LLMs for inference in GitHub Actions - add to your workflow! | 29 | Experimental |
| 12 | rookiemann/vllm-windows-build | Native Windows build patches for vLLM v0.14.1: MSVC 2022 + CUDA 12.6, 26... | 26 | Experimental |
| 13 | nicholasyager/llama-cpp-guidance | A guidance compatibility layer for llama-cpp-python | 26 | Experimental |
| 14 | onidahabitual85/llm-server | Launch and optimize llama.cpp servers automatically across Linux, macOS, and... | 23 | Experimental |
| 15 | thansen0/fastllm.cpp | A low-latency, fault-tolerant API for accessing LLMs, written in C++ using llama.cpp. | 23 | Experimental |
| 16 | rookiemann/llama-cpp-python-py314-cuda131-wheel | GPU-accelerated llama-cpp-python 0.3.16 wheel for Python 3.14 (CUDA 13.1, Windows) | 21 | Experimental |
| 17 | andrewginns/LocalLLM | Configurations for a locally hosted LLM and applications leveraging it | 21 | Experimental |
| 18 | frost-beta/llama2-high-level-cpp | Inference Llama2 with high-level C++. | 21 | Experimental |
| 19 | abhishekrana/llm-service | RESTful service with LLMs (Large Language Models) running locally | 17 | Experimental |
| 20 | caiomadeira/llama2-psp | Llama 2 inference in C on the PlayStation Portable (PSP). | 15 | Experimental |
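
The tiers appear to track score bands. Below is a minimal sketch of the apparent mapping, inferred only from the scores listed above: 50 and up is Established, and the Emerging/Experimental boundary is assumed to sit at 30, since the listing shows 32 as Emerging and 29 as Experimental.

```python
def tier_for(score: int) -> str:
    """Map a quality score to its apparent tier.

    Cutoffs are inferred from the listing, not from any published
    rubric; the exact Emerging/Experimental boundary is an assumption.
    """
    if score >= 50:
        return "Established"
    if score >= 30:  # ASSUMED boundary: listing shows 32 Emerging, 29 Experimental
        return "Emerging"
    return "Experimental"

# Spot-check against entries from the table above.
assert tier_for(51) == "Established"
assert tier_for(44) == "Emerging"
assert tier_for(29) == "Experimental"
```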