jpreagan/llmnop

A tool for measuring LLM performance metrics.

Score: 35 / 100 (Emerging)

This tool helps AI/ML operations engineers and MLOps professionals evaluate the real-world performance of large language models (LLMs) served via API endpoints. You provide details like the API URL, model name, and desired input/output token counts, and it produces detailed metrics on latency (like time to first token) and throughput. This allows you to compare different LLM providers, validate deployments, or optimize serving configurations.
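To make the two headline metrics concrete, here is a hypothetical sketch (not llmnop's actual implementation) of how time to first token and output throughput can be derived from the arrival times of streamed tokens:

```python
# Hypothetical metric sketch: given the request start time and the
# arrival times of streamed tokens (seconds), compute the two core
# numbers a benchmark like this reports.

def ttft(start: float, token_times: list[float]) -> float:
    """Time to first token: delay from request start to the first token."""
    return token_times[0] - start

def throughput(token_times: list[float]) -> float:
    """Output tokens per second over the generation window."""
    span = token_times[-1] - token_times[0]
    return (len(token_times) - 1) / span if span > 0 else float("inf")

times = [0.25, 0.30, 0.35, 0.40, 0.45]  # token arrival times, seconds
print(ttft(0.0, times))   # 0.25 s to first token
print(throughput(times))  # ~20 tokens/sec
```

Real measurements add detail this sketch omits (tokenization of the response, warm-up requests, concurrency), but the shape of the computation is the same.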

Use this if you need to reliably measure how fast your LLM inference endpoints are responding and generating tokens under various load conditions.

Not ideal if you're a data scientist primarily focused on model accuracy or training, rather than the operational performance of deployed models.

Tags: LLM-operations, MLOps, API-benchmarking, inference-performance, system-tuning
No package · No dependents
Maintenance: 10 / 25
Adoption: 9 / 25
Maturity: 16 / 25
Community: 0 / 25


Stars: 9
Forks:
Language: Rust
License: Apache-2.0
Last pushed: Feb 21, 2026
Monthly downloads: 38
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/jpreagan/llmnop"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
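The curl command above can also be issued from code. A minimal Python sketch, assuming the endpoint returns JSON (the response field names are not documented here, so none are assumed):

```python
# Fetch quality data for a repo from the endpoint shown above.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repo quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON response (schema not assumed)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.loads(resp.read().decode("utf-8"))

print(quality_url("jpreagan", "llmnop"))
```

Calling `fetch_quality("jpreagan", "llmnop")` counts against the daily rate limit, so batch callers should cache responses.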