underneathall/pinferencia
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
This project helps data scientists and machine learning engineers quickly get their machine learning models ready for others to use. You provide your trained model, and it generates an interactive web interface and a programmatic API. This allows other applications or end-users to send data to your model and receive predictions, without needing to understand the underlying code.
545 stars. No commits in the last 6 months. Available on PyPI.
Use this if you need to rapidly deploy a machine learning model, such as for quick prototyping or internal tool development, and want a simple way to give it a user interface and an API.
Not ideal if you require complex enterprise-level model serving features like advanced monitoring, robust security configurations, or highly specialized distributed inference patterns.
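As a sketch of the workflow described above, the snippet below wraps a toy model and registers it with Pinferencia's `Server`. The class name `SumModel` is a hypothetical stand-in for a real trained model; the registration call follows the pattern shown in the library's documentation and assumes `pinferencia` is installed from PyPI.

```python
# Minimal sketch of the Pinferencia workflow (assumes `pip install pinferencia`).
# SumModel is a hypothetical stand-in for a real trained model.

class SumModel:
    def predict(self, data):
        # A real model's inference logic would go here.
        return sum(data)

try:
    from pinferencia import Server  # third-party; may not be installed

    service = Server()
    # Register the model under a name; per the docs, predictions are then
    # served at a path like /v1/models/sum/predict.
    service.register(model_name="sum", model=SumModel(), entrypoint="predict")
except ImportError:
    service = None  # pinferencia absent; the sketch still shows the shape
```

With this saved as `app.py`, the project's README suggests launching the server with `uvicorn app:service`, after which clients can POST JSON such as `{"data": [1, 2, 3]}` to the prediction endpoint and receive the model's output.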
Stars: 545
Forks: 82
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 14, 2023
Commits (30d): 0
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/underneathall/pinferencia"
Open to everyone: no key needed for up to 100 requests/day, and a free key raises the limit to 1,000/day.
Related models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...