vitoplantamura/OnnxStream
Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a Raspberry Pi Zero 2 (or in 298 MB of RAM), as well as Mistral 7B on desktops and servers. ARM, x86, WASM, and RISC-V are supported, with acceleration by XNNPACK. Python, C#, and JS (WASM) bindings available.
This project helps hobbyists and specialized professionals run complex AI models, such as Stable Diffusion for image generation, large language models (LLMs) for text, or YOLO for object detection, on resource-constrained devices like a Raspberry Pi or in web browsers. It executes trained models with very little memory, producing images, text, or detected objects even on hardware with limited RAM. It's aimed at users who need to deploy advanced AI capabilities efficiently on small, low-power machines or directly within web applications.
Use this if you need to run powerful AI models on hardware with very limited memory, such as single-board computers or directly in a web browser, without sacrificing the quality of the model's output.
Not ideal if you are primarily focused on maximizing inference speed or throughput on high-end hardware, where memory consumption is not a concern.
Stars: 2,031
Forks: 89
Language: C++
License: —
Category:
Last pushed: Jan 20, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/vitoplantamura/OnnxStream"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related models
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sgl-project/sglang
SGLang is a high-performance serving framework for large language models and multimodal models.
alibaba/MNN
MNN: A blazing-fast, lightweight inference engine battle-tested by Alibaba, powering...
xorbitsai/inference
Swap GPT for any LLM by changing a single line of code. Xinference lets you run open-source,...
tensorzero/tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM...