Chen-zexi/vllm-cli

A command-line interface tool for serving LLMs using vLLM.

Score: 57 / 100 (Established)

This tool helps developers and ML engineers serve large language models (LLMs) efficiently on their own hardware. It takes local model files (such as those from Hugging Face or Ollama) and turns them into a high-performance, accessible service, managed either interactively or from the command line. ML practitioners who need to deploy and manage LLMs for applications or testing will find it useful.
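Under the hood, vLLM exposes an OpenAI-compatible HTTP API once a model is being served. As a minimal sketch (assuming a server is already running on vLLM's default port, 8000, and using a placeholder model name), a client can query the service with plain Python:

import requests

# Assumes a vLLM-backed server is already running locally on its
# default port (8000); "your-model-name" is a placeholder for
# whichever model was loaded.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 64,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])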

482 stars. Available on PyPI.

Use this if you need to serve one or more large language models from your own GPU-equipped machine with optimized performance and easy management.

Not ideal if you primarily use cloud-based LLM APIs or don't have access to CUDA-compatible GPU hardware.

Tags: LLM deployment, model serving, ML infrastructure, GPU optimization, local AI
Score breakdown (the four components sum to the overall 57 / 100):
Maintenance 10 / 25
Adoption 10 / 25
Maturity 24 / 25
Community 13 / 25


Stars: 482
Forks: 27
Language: Python
License: MIT
Last pushed: Jan 25, 2026
Commits (30d): 0
Dependencies: 10

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Chen-zexi/vllm-cli"

Open to everyone: 100 requests/day with no API key required. A free key raises the limit to 1,000 requests/day.
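The same data can be fetched programmatically. Here is a minimal Python sketch using the requests library; the response schema is not documented here, so it simply pretty-prints whatever JSON the endpoint returns:

import json

import requests

# Same endpoint as the curl command above.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Chen-zexi/vllm-cli"

resp = requests.get(URL, timeout=30)
resp.raise_for_status()

# The response schema is undocumented here, so just pretty-print it.
print(json.dumps(resp.json(), indent=2))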