cloudmercato/ollama-benchmark
Handy tool to measure the performance and efficiency of LLM workloads.
This tool helps AI engineers and researchers assess how well their Ollama-hosted large language models (LLMs) are performing. It takes LLM models and test parameters as input and outputs detailed performance metrics such as response speed, embedding generation time, and answer quality. You can use it to compare different models or to optimize a single model's setup for specific tasks.
No commits in the last 6 months.
Use this if you need to systematically measure and compare the speed, efficiency, and quality of different LLM configurations running on Ollama.
Not ideal if you are looking for a tool to deploy or manage your LLMs, or if you need to benchmark models hosted on platforms other than Ollama.
Stars: 76
Forks: 8
Language: Python
License: MIT
Category:
Last pushed: Apr 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/cloudmercato/ollama-benchmark"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
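For scripted access, the same endpoint can be queried from Python. The sketch below is a minimal example, assuming the endpoint returns JSON and that keyed requests pass the key in a bearer Authorization header; only the URL comes from the curl command above, the response schema and header name are assumptions.

import requests

# Endpoint copied from the curl example above.
URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/cloudmercato/ollama-benchmark"

def fetch_repo_quality(api_key=None, timeout=10):
    """Fetch repository quality data; pass an API key for the higher rate limit."""
    headers = {}
    if api_key:
        # Header name is hypothetical; check the API docs for the actual scheme.
        headers["Authorization"] = f"Bearer {api_key}"
    response = requests.get(URL, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    data = fetch_repo_quality()
    print(data)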
Higher-rated alternatives
stanfordnlp/axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
aidatatools/ollama-benchmark
LLM Benchmark for Throughput via Ollama (Local LLMs)
LarHope/ollama-benchmark
Ollama-based benchmark with detailed I/O tokens-per-second metrics; Python, with a DeepSeek R1 example.
qcri/LLMeBench
Benchmarking Large Language Models
THUDM/LongBench
LongBench v2 and LongBench (ACL '25 & '24)