RayFernando1337/LLM-Calc

Instantly calculate the maximum size of quantized language models that can fit in your available RAM, helping you optimize your models for inference.

Emerging

This tool helps you quickly work out the largest language model (LLM) that will fit in your computer's memory. You enter your available RAM, how much memory your operating system uses, and the desired quality level of the quantized model, and it reports the maximum model size, in parameters, that you can run. It is aimed at AI developers, researchers, and anyone building or experimenting with LLMs on their own hardware.
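The calculation described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function name `maxParamsBillions`, the `BITS_PER_WEIGHT` table, and the simplification that model weights are the only memory cost (no KV cache, activations, or runtime overhead) are all assumptions made for the example.

```typescript
// Hypothetical sketch, not taken from the LLM-Calc source.
// Assumption: weights dominate memory use; KV cache and runtime overhead are ignored.

/** Illustrative bits-per-weight values for a few common precision levels. */
const BITS_PER_WEIGHT = {
  fp16: 16,
  q8: 8,
  q4: 4,
} as const;

/**
 * Estimate the largest model, in billions of parameters, whose weights
 * fit in the memory left over after the operating system's share.
 */
function maxParamsBillions(
  totalRamGb: number,
  osUsageGb: number,
  bitsPerWeight: number,
): number {
  if (bitsPerWeight <= 0) {
    throw new RangeError("bitsPerWeight must be positive");
  }
  // Memory actually available to the model, never negative.
  const usableGb = Math.max(0, totalRamGb - osUsageGb);
  // Each parameter occupies bitsPerWeight / 8 bytes.
  const bytesPerParam = bitsPerWeight / 8;
  // Treating 1 GB as 1e9 bytes, GB divided by bytes-per-parameter
  // gives the parameter count in billions.
  return usableGb / bytesPerParam;
}

// Example: 16 GB of RAM, 4 GB used by the OS, 4-bit quantization.
console.log(maxParamsBillions(16, 4, BITS_PER_WEIGHT.q4)); // 24
```

Under these assumptions, halving the bits per weight doubles the parameter count that fits, which is why the quality level has such a large effect on the result. A real estimate would come out lower once context length and runtime overhead are taken into account.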

253 stars.

Use this if you need to determine the largest quantized LLM that fits in your GPU memory or system RAM for efficient inference or local development.

Not ideal if you are looking for a tool to train or fine-tune LLMs, or if you are deploying LLMs in the cloud, where memory management is abstracted away.

LLM deployment, model inference, AI hardware optimization, local LLM development, quantization planning

No Package, No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 10 / 25

Stars: 253

Forks

Language: TypeScript

License: MIT

Higher-rated alternatives

vllm-project/vllm-ascend

Community maintained hardware plugin for vLLM on Ascend

SemiAnalysisAI/InferenceX

Open Source Continuous Inference Benchmarking Qwen3.5, DeepSeek, GPTOSS - GB200 NVL72 vs MI355X...

kvcache-ai/Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

uccl-project/uccl

UCCL is an efficient communication library for GPUs, covering collectives, P2P (e.g., KV cache...

sophgo/tpu-mlir

Machine learning compiler based on MLIR for Sophgo TPU.
