g1ibby/llm-deploy
Tool to manage Ollama models on vast.ai
This tool helps developers quickly set up and manage large language models (LLMs) like Llama on cloud servers through vast.ai. You provide configuration details for your desired LLMs, and the tool automates their deployment and lifecycle. It's designed for developers who want to experiment with or host LLMs without manual server configuration.
No commits in the last 6 months.
Use this if you are a developer looking for an automated way to deploy and manage Ollama-compatible LLMs on vast.ai for experimentation or hosting.
Not ideal if you prefer a graphical user interface for managing cloud instances or if you're not comfortable with command-line tools and YAML configurations.
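The listing mentions that the tool is driven by YAML configuration rather than a GUI. The project's actual schema isn't shown on this page, so the following is only a hypothetical sketch of what a deployment config for an Ollama model on vast.ai might look like; every key name (`model`, `gpu`, `disk_gb`, `max_price`) is an assumption for illustration, not the tool's real format:

```yaml
# Hypothetical llm-deploy configuration (field names are illustrative,
# not taken from the actual project).
deployments:
  - name: llama3-dev          # label for this instance
    model: llama3:8b          # Ollama model tag to pull on startup
    gpu: RTX_4090             # vast.ai GPU type to search for
    disk_gb: 40               # disk needed for model weights
    max_price: 0.45           # max $/hr bid on the vast.ai market
```

A real config would need to match whatever schema the repository documents; check its README before adapting this sketch.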
Stars
19
Forks
1
Language
Python
License
MIT
Category
Last pushed
Apr 19, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/g1ibby/llm-deploy"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
PaddlePaddle/FastDeploy
High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle
mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
skyzh/tiny-llm
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny...
ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
AXERA-TECH/ax-llm
Explore LLM model deployment based on AXera's AI chips