autonomi-ai/nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
This project helps machine learning engineers and MLOps teams quickly deploy and manage AI models such as large language models, image generators, and audio transcription tools. You provide a model and input data (text, images, or audio), and it returns the model's output, ready to use in your applications. It's designed for teams building AI-powered products that need efficient model serving.
147 stars. No commits in the last 6 months.
Use this if you need a flexible and performant way to serve multiple PyTorch AI models, including LLMs, diffusion models, and more, across different cloud environments or hardware.
Not ideal if you are looking for a no-code solution or primarily work with non-PyTorch machine learning frameworks.
Stars: 147
Forks: 12
Language: Python
License: Apache-2.0
Category: Generative AI
Last pushed: Jun 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/autonomi-ai/nos"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
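The same endpoint can be queried programmatically. Below is a minimal Python sketch assuming the endpoint returns JSON; the field names used (stars, forks, language, license) are assumptions based on the stats shown above, not a documented schema, so inspect the response to confirm.

import requests

# Public quality-data endpoint for this repository (100 requests/day without a key).
URL = "https://pt-edge.onrender.com/api/v1/quality/generative-ai/autonomi-ai/nos"

response = requests.get(URL, timeout=10)
response.raise_for_status()
data = response.json()

# Field names below are assumed from the stats displayed on this page.
print(data.get("stars"))     # e.g. 147
print(data.get("forks"))     # e.g. 12
print(data.get("language"))  # e.g. "Python"
print(data.get("license"))   # e.g. "Apache-2.0"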
Higher-rated alternatives
openvinotoolkit/model_server: A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server: MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea: Mellea is a library for writing generative programs.
rhesis-ai/rhesis: Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...