vitoplantamura/OnnxStream

Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a Raspberry Pi Zero 2 (in 298 MB of RAM), as well as Mistral 7B on desktops and servers. ARM, x86, WASM and RISC-V are supported. Accelerated by XNNPACK. Python, C# and JS (WASM) bindings are available.

Score: 51 / 100 (Established)

This project helps hobbyists and specialized professionals run complex AI models (Stable Diffusion for image generation, large language models for text, YOLO for object detection) on resource-constrained devices such as a Raspberry Pi, or directly in web browsers. It runs trained models with very little memory, producing images, text, or detections even on hardware with limited RAM. It's for users who need to deploy advanced AI capabilities efficiently on small, low-power machines or within web applications.

2,031 stars.

Use this if you need to run powerful AI models on hardware with very limited memory, such as single-board computers or directly in a web browser, without sacrificing model output quality.

Not ideal if you are primarily focused on maximizing inference speed or throughput on high-end hardware, where memory consumption is not a constraint.

edge-ai embedded-systems image-generation natural-language-processing object-detection
No package · No dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 15 / 25

How are scores calculated?
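The overall score appears to be the plain sum of the four subscores above, each out of 25. This is a minimal sketch under that assumption; the actual weighting is not documented on this page.

```python
# Subscores as shown on this page (each out of 25).
subscores = {
    "Maintenance": 10,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 15,
}

# Assumption: the overall score is the unweighted sum, out of 100.
overall = sum(subscores.values())
print(f"{overall} / 100")  # → 51 / 100, matching the score shown above
```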

Stars: 2,031
Forks: 89
Language: C++
License: (none listed)
Last pushed: Jan 20, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/vitoplantamura/OnnxStream"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
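The curl call above can also be scripted. A minimal Python sketch, using only the standard library; the shape of the JSON response (field names, nesting) is an assumption, so the example only decodes the body and leaves interpretation to the caller.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """GET the quality record and decode the JSON body.

    Note: response field names are not documented on this page,
    so callers should inspect the returned dict themselves.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(quality_url("vitoplantamura", "OnnxStream"))
```

Without a key this stays within the 100 requests/day limit; a free key (1,000/day) would presumably be passed as a header or query parameter, but the exact mechanism is not documented here.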