RahulSChand/gpu_poor

Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization

Score: 34 / 100 (Emerging)

This tool helps you estimate whether your GPU can handle a specific Large Language Model (LLM) and how fast it will generate text. You enter details about the model, your GPU, and your desired settings, and it reports the required GPU memory and the approximate tokens generated per second. It's designed for anyone working with LLMs who needs to understand hardware constraints before running or fine-tuning these models.

1,396 stars. No commits in the last 6 months.

Use this if you need to quickly estimate the GPU memory and token throughput of a large language model before attempting to run or fine-tune it.

Not ideal if you need perfectly precise, real-time measurements, as the tool provides estimations rather than exact, live performance metrics.

LLM deployment · GPU resource planning · Model fine-tuning · AI model capacity · Generative AI hardware
No License · Stale 6m · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 16 / 25


Stars: 1,396
Forks: 87
Language: JavaScript
License: None
Last pushed: Dec 03, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/RahulSChand/gpu_poor"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.