ChiefGyk3D/FrankenLLM

Stitched-together GPUs, but it lives! Run different LLM models optimally across multiple NVIDIA GPUs

Quality score: 36 / 100 (Emerging)

Maximize the use of your NVIDIA GPUs by running multiple large language models (LLMs) simultaneously, even if your GPUs have different memory capacities. This project lets you list several LLM models and assign each one to a specific GPU, so every model is served independently with settings suited to its card. It's designed for anyone managing dedicated LLM servers or multi-GPU home lab machines who needs to serve different models efficiently.

Use this if you have multiple NVIDIA GPUs and want to run different LLM models on each of them concurrently, ensuring maximum hardware utilization and zero interference between models.

Not ideal if you only have a single GPU or if your primary need is to run a single, extremely large model that spans across multiple GPUs.
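The core idea described above (one model pinned to one GPU, served independently) can be sketched in plain shell by restricting each server process to a single device via `CUDA_VISIBLE_DEVICES`. This is a minimal illustration, not FrankenLLM's actual interface: the `launch_on_gpu` helper, the `llama-server` command, the model paths, and the ports are all hypothetical placeholders.

```shell
# Run a command so it sees only one GPU. Each launched process gets its
# own CUDA_VISIBLE_DEVICES value, so models do not contend for VRAM on
# each other's cards.
launch_on_gpu() {
  gpu="$1"; shift
  CUDA_VISIBLE_DEVICES="$gpu" "$@" &
}

# Hypothetical usage: one model server per GPU, each on its own port.
# launch_on_gpu 0 llama-server -m /models/llama3-8b.gguf --port 8001
# launch_on_gpu 1 llama-server -m /models/mistral-7b.gguf --port 8002
# wait
```

Because each server is an ordinary background process, they can be started, stopped, and monitored independently, which is what allows different models to coexist without interfering with one another.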

LLM-deployment AI-inference GPU-resource-management local-AI-serving AI-model-hosting
No Package No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 13 / 25
Community 8 / 25

How are scores calculated?

Stars: 9
Forks: 1
Language: Shell
License:
Last pushed: Feb 05, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ChiefGyk3D/FrankenLLM"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.