aws-samples/foundation-model-benchmarking-tool

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark its performance across instance types and serving stack options.

Overall score: 46 / 100 (Emerging)

This tool helps machine learning engineers and researchers evaluate the performance and accuracy of large language models (LLMs) and other foundation models. You provide your chosen models and AWS deployment configurations, and it outputs detailed benchmarks of inference latency, throughput, cost-performance, and model accuracy. It is aimed at anyone building and deploying generative AI applications who needs to choose the best model and infrastructure.

255 stars. No commits in the last 6 months.

Use this if you need to determine the optimal foundation model and AWS serving stack (like SageMaker or Bedrock) for your generative AI application, considering both performance and cost.

Not ideal if you are not deploying or evaluating foundation models on AWS, or if you only need a quick, informal check of model outputs without detailed metrics.

Tags: generative-AI-deployment · LLM-evaluation · cloud-resource-optimization · machine-learning-operations · model-selection
Flags: Stale (6 months) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 20 / 25
(The four category scores sum to the 46 / 100 overall score.)

Stars: 255
Forks: 44
Language: Jupyter Notebook
License: MIT-0
Last pushed: Apr 11, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/aws-samples/foundation-model-benchmarking-tool"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
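
For programmatic access, here is a minimal Python sketch that fetches this repo's quality report and pretty-prints the response. The endpoint URL and rate limits come from the listing above; the response schema is not documented here, so the sketch makes no assumptions about specific field names.

import json
import urllib.request

# Endpoint from the listing above; no API key is needed
# for up to 100 requests/day.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "generative-ai/aws-samples/foundation-model-benchmarking-tool")

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)

# The JSON schema isn't documented in this listing, so pretty-print
# the whole payload rather than guessing at particular keys.
print(json.dumps(data, indent=2))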