brontoguana/krasis
Krasis is a hybrid LLM runtime focused on running large models efficiently on consumer-grade, VRAM-limited hardware.
Krasis helps AI practitioners run very large language models (LLMs) that are too big for typical consumer graphics cards on their existing hardware. You provide the model files, and Krasis loads and serves them for efficient text generation on one or a few NVIDIA GPUs. It targets AI developers, researchers, and data scientists who want to experiment with or deploy state-of-the-art LLMs without expensive, specialized server infrastructure.
Use this if you need to run massive language models (roughly 80B to 200B+ parameters) on a single consumer-grade NVIDIA GPU or a small cluster and want strong performance without sacrificing too much output quality.
Not ideal if you don't have an NVIDIA GPU, or if you mainly run smaller models that already fit comfortably in your GPU's VRAM.
Stars: 52
Forks: 6
Language: Rust
License: —
Category:
Last pushed: Mar 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/brontoguana/krasis"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
EricLBuehler/mistral.rs: Fast, flexible LLM inference
nerdai/llms-from-scratch-rs: A comprehensive Rust translation of the code from Sebastian Raschka's Build an LLM from Scratch book.
ShelbyJenkins/llm_utils: Basic LLM tools, best practices, and minimal abstraction.
Mattbusel/llm-wasm: LLM inference primitives for WebAssembly — cache, retry, routing, guards, cost tracking, templates
GoWtEm/llm-model-selector: A high-performance Rust utility that analyzes your system hardware to recommend the optimal LLM...