EricLBuehler/mistral.rs
Fast, flexible LLM inference
Mistral.rs helps you efficiently run large language models (LLMs) on your own computer or server. You provide a model and text prompts, optionally with images or audio, and it returns generated text, descriptions, or even new images. The tool targets developers, researchers, and engineers who need to deploy and interact with powerful AI models directly from their applications.
6,681 stars. Actively maintained with 32 commits in the last 30 days.
Use this if you need to integrate diverse multimodal AI capabilities (text, image, audio, video) into your applications with high performance and full control over model quantization.
Not ideal if you're looking for a simple, no-code AI chat interface or don't have programming experience to integrate an SDK or use a CLI.
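To make the "integrate into your applications" point concrete: mistral.rs can run as an HTTP server that speaks an OpenAI-compatible chat API, so any OpenAI-style client can talk to it. The sketch below only builds a request body in that format; the port (1234) and model id ("default") are placeholder assumptions, not values taken from this page, so check the project's own docs before using them.

```python
import json

# Assumed local mistral.rs server endpoint (OpenAI-compatible API).
# Host, port, and path are illustrative placeholders.
BASE_URL = "http://localhost:1234/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "default") -> dict:
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

payload = build_chat_request("Summarize what mistral.rs does.")
print(json.dumps(payload, indent=2))
```

You would POST this payload (with `Content-Type: application/json`) to the running server using any HTTP client; the response follows the usual chat-completion shape.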
Stars: 6,681
Forks: 540
Language: Rust
License: MIT
Category:
Last pushed: Feb 27, 2026
Commits (30d): 32
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/EricLBuehler/mistral.rs"
Open to everyone: 100 requests per day with no key needed; a free key raises the limit to 1,000 per day.
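If you are calling this endpoint from code rather than curl, a small URL builder keeps the owner/repo interpolation in one place. This is a sketch: the response schema is not shown on this page, so only the URL construction is assumed here, and the fetch itself is left to whatever HTTP client you prefer.

```python
from urllib.parse import quote

# Base path as shown in the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair,
    percent-encoding any characters that need it."""
    return f"{API_BASE}/{quote(owner)}/{quote(repo)}"

print(quality_url("EricLBuehler", "mistral.rs"))
# -> https://pt-edge.onrender.com/api/v1/quality/transformers/EricLBuehler/mistral.rs
```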
Related models
nerdai/llms-from-scratch-rs
A comprehensive Rust translation of the code from Sebastian Raschka's Build an LLM from Scratch book.
brontoguana/krasis
Krasis is a Hybrid LLM runtime which focuses on efficient running of larger models on consumer...
ShelbyJenkins/llm_utils
llm_utils: Basic LLM tools, best practices, and minimal abstraction.
Mattbusel/llm-wasm
LLM inference primitives for WebAssembly — cache, retry, routing, guards, cost tracking, templates
GoWtEm/llm-model-selector
A high-performance Rust utility that analyzes your system hardware to recommend the optimal LLM...