OpenLMLab/MOSS_Vortex
Moss Vortex is a lightweight, high-performance deployment and inference backend engineered specifically for MOSS 003. Built on MOSEC and Torch, it provides a range of features aimed at improving performance and functionality.
This project helps AI developers and researchers quickly set up and run the MOSS 003 language model on their own GPU servers. It takes user prompts as input and efficiently streams back AI-generated text responses, supporting continuous conversations and custom tool integrations (a minimal client sketch follows the summary below). The intended user is a machine learning engineer or AI practitioner responsible for deploying and managing large language models.
No commits in the last 6 months.
Use this if you need a fast and lightweight server to deploy the MOSS 003 model for real-time text generation and interactive AI applications.
Not ideal if your application heavily relies on advanced token batching for complex LLM reasoning tasks, as this feature is not yet fully implemented.
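To make the prompt-in, text-out flow described above concrete, here is a minimal client sketch. The host, port, /inference route, payload field names, and sampling parameters are illustrative assumptions, not taken from the repository's documentation; check the worker code in the repo for the actual request schema.

# Minimal client sketch for a locally running MOSS Vortex instance.
# Assumptions (hypothetical, for illustration only): the server listens on
# localhost:21333, exposes a MOSEC-style POST /inference route, and
# accepts/returns JSON with the field names shown here.
import json
import urllib.request

ENDPOINT = "http://localhost:21333/inference"  # hypothetical host/port

payload = {
    # "x" is a placeholder field name for the prompt; the sampling
    # parameter names below are also placeholders.
    "x": "<|Human|>: Hello, MOSS!<eoh>\n<|MOSS|>:",
    "temperature": 0.7,
    "top_p": 0.8,
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Print whatever the server returns; the response schema depends on the worker.
with urllib.request.urlopen(req, timeout=60) as resp:
    print(json.loads(resp.read().decode("utf-8")))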
Stars: 37
Forks: 9
Language: Python
License: AGPL-3.0
Category:
Last pushed: Apr 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenLMLab/MOSS_Vortex"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
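For scripted access, the same endpoint can be queried from Python. The sketch below simply pretty-prints whatever JSON the service returns, since the response schema is not documented here.

# Python equivalent of the curl command above.
import json
import urllib.request

URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OpenLMLab/MOSS_Vortex"

with urllib.request.urlopen(URL, timeout=30) as resp:
    # The response fields are not documented here, so just dump the raw JSON.
    print(json.dumps(json.load(resp), indent=2))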
Higher-rated alternatives
jundot/omlx
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the...
josStorer/RWKV-Runner
A RWKV management and startup tool, full automation, only 8MB. And provides an interface...
waybarrios/vllm-mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models...
jordanhubbard/nanolang
A tiny experimental language designed to be targeted by coding LLMs
akivasolutions/tightwad
Pool your CUDA + ROCm GPUs into one OpenAI-compatible API. Speculative decoding proxy gives you...