autonomi-ai/nos
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
This project helps machine learning engineers and MLOps teams quickly deploy and manage AI models such as large language models, image generators, and audio transcription tools. You provide a model and input data (text, images, or audio), and it returns the model's output, ready to use in your applications. It's designed for teams building AI-powered products that need efficient model serving.
147 stars. No commits in the last 6 months.
Use this if you need a flexible and performant way to serve multiple PyTorch AI models, including LLMs, diffusion models, and more, across different cloud environments or hardware.
Not ideal if you are looking for a no-code solution or primarily work with non-PyTorch machine learning frameworks.
Stars: 147
Forks: 12
Language: Python
License: Apache-2.0
Category: Generative AI
Last pushed: Jun 08, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/autonomi-ai/nos"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
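The same endpoint can be queried programmatically. Below is a minimal Python sketch assuming the endpoint returns JSON; the field names used (stars, forks, language, license) are assumptions based on the stats shown above, not a documented schema, so inspect the response to confirm.

import requests

# Public quality-data endpoint for this repository (100 requests/day without a key).
URL = "https://pt-edge.onrender.com/api/v1/quality/generative-ai/autonomi-ai/nos"

response = requests.get(URL, timeout=10)
response.raise_for_status()
data = response.json()

# Field names below are assumed from the stats displayed on this page.
print(data.get("stars"))     # e.g. 147
print(data.get("forks"))     # e.g. 12
print(data.get("language"))  # e.g. "Python"
print(data.get("license"))   # e.g. "Apache-2.0"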
Higher-rated alternatives
openvinotoolkit/model_server: A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server: MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails: NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea: Mellea is a library for writing generative programs.
rhesis-ai/rhesis: Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...