brontoguana/krasis

Krasis is a hybrid LLM runtime focused on efficiently running larger models on consumer-grade, VRAM-limited hardware.

Score: 41 / 100 (Emerging)

Krasis helps AI practitioners run very large language models (LLMs), those too big for typical consumer graphics cards, on their existing hardware. You provide the large LLM files, and Krasis processes them for efficient text generation and understanding on one or a few NVIDIA GPUs. It is aimed at AI developers, researchers, and data scientists who want to experiment with or deploy state-of-the-art LLMs without expensive, specialized server infrastructure.

Use this if you need to run massive language models (like 80B to 200B+ parameters) on a single consumer-grade NVIDIA GPU or a small cluster, and you want excellent performance without sacrificing too much quality.

Not ideal if you don't have an NVIDIA GPU, or if your primary need is to run smaller models that already fit comfortably in your GPU's VRAM.

Large-Language-Models AI-inference GPU-optimization Machine-Learning-deployment AI-research
No Package No Dependents
Maintenance 10 / 25
Adoption 8 / 25
Maturity 11 / 25
Community 12 / 25

Stars: 52
Forks: 6
Language: Rust
License:
Last pushed: Mar 11, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/brontoguana/krasis"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
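For scripted use, the same endpoint can be called from Python with only the standard library. This is a minimal sketch: the URL pattern is taken from the curl example above, but the `quality_url` and `fetch_quality` helper names are illustrative, and the shape of the returned JSON is not documented here, so the response is handled as an opaque dict.

```python
import json
import urllib.request

# Base path taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-score endpoint URL for a given repository."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report.

    No API key is required for up to 100 requests/day; the JSON
    field names are not specified here, so callers should inspect
    the returned dict themselves.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

# Example: the endpoint URL for this repository.
print(quality_url("brontoguana", "krasis"))
```

Keeping URL construction separate from the network call makes the helper easy to test offline and to point at other repositories.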