fboulnois/llama-cpp-docker

Run llama.cpp in a GPU accelerated Docker container

Score: 48 / 100 (Emerging)

This project helps developers quickly set up and run local large language models (LLMs) on their own hardware. You supply a model name from Hugging Face, and it provides a local chat server running that model, accessible via a web browser. It is aimed at software developers who want to integrate or experiment with LLMs without relying on cloud services.

Use this if you are a developer looking to host and interact with open-source LLMs locally on a GPU-accelerated server for testing or application development.

Not ideal if you are an end-user without programming experience or specific developer needs, as it requires comfort with command-line tools and Docker.
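Beyond the browser, the local chat server can also be queried from code. A minimal sketch, assuming the container exposes llama.cpp's standard OpenAI-compatible `/v1/chat/completions` endpoint on `localhost:8080`; the actual host, port, and endpoint depend on how the container is configured, so check the repo's README:

```python
import json
import urllib.request

# Assumed address: llama.cpp's bundled server speaks an OpenAI-compatible
# chat API, but the mapped host/port depends on the container setup.
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str) -> dict:
    """Build the JSON payload for a single-turn chat completion."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send the prompt to the local server and return the model's reply."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        SERVER_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Only the standard library is used here, so the snippet works against any server that accepts the OpenAI chat-completions payload shape.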

Tags: local-LLM-deployment, GPU-acceleration, model-serving, AI-application-development, containerization
No Package · No Dependents
Maintenance 6 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 18 / 25


Stars: 63
Forks: 15
Language: Dockerfile
License: MIT
Last pushed: Dec 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/fboulnois/llama-cpp-docker"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
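The same data can be fetched from code. A small sketch using only the standard library; the URL pattern is taken from the curl example above, and the shape of the returned JSON is not documented here, so the fetch simply returns the parsed response:

```python
import json
import urllib.request

# Base URL taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner_repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner_repo}"

def fetch_quality(owner_repo: str) -> dict:
    """Fetch the quality record (no key needed up to 100 requests/day)."""
    with urllib.request.urlopen(quality_url(owner_repo)) as resp:
        return json.load(resp)
```

For example, `fetch_quality("fboulnois/llama-cpp-docker")` requests the same URL as the curl command above.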