argonne-lcf/LLM-Inference-Bench

LLM-Inference-Bench

Score: 38 / 100 (Emerging)

This tool helps AI researchers and system architects understand how different large language models perform on various AI hardware accelerators. You input information about the LLM (like LLaMA or Mistral) and the hardware platform you're considering (like Nvidia GPUs, AMD GPUs, or Intel Habana), and it provides detailed performance metrics to help you select the most efficient configuration. It's designed for anyone who must weigh the computational demands of deploying LLMs for text-generation applications.

No commits in the last 6 months.

Use this if you need to determine the optimal combination of a large language model, inference framework, and hardware accelerator to achieve the best performance and scalability for your AI applications.

Not ideal if you are an end-user of an LLM application and don't manage the underlying hardware or software infrastructure.

Tags: AI-infrastructure, LLM-deployment, hardware-evaluation, performance-optimization, AI-research
Badges: Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 12 / 25


Stars: 60
Forks: 7
Language: Jupyter Notebook
License: BSD-3-Clause
Last pushed: Jul 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/argonne-lcf/LLM-Inference-Bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
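The same endpoint can be queried from Python with only the standard library. A minimal sketch: the URL layout mirrors the curl example above (including its `transformers` path segment), but the response's JSON field names are not documented here, so the record is returned as a raw dict for inspection rather than mapped to named attributes.

```python
import json
import urllib.request

# Base path taken from the curl example shown above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-API URL for a repository."""
    return f"{API_BASE}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch one quality record as a dict.

    The response schema is undocumented here: inspect the raw JSON
    before relying on any particular key.
    """
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)
```

Usage would look like `fetch_quality("transformers", "argonne-lcf", "LLM-Inference-Bench")`, then pretty-printing the result with `json.dumps(data, indent=2)` to see what fields the API actually returns. Note the daily rate limits quoted above still apply.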