aws-samples/foundation-model-benchmarking-tool

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark its performance across instance types and serving stack options.

Overall score: 46 / 100 (Emerging)

This tool helps machine learning engineers and researchers evaluate the performance and accuracy of large language models (LLMs) and other foundation models. You provide your chosen models and AWS deployment configurations, and it outputs detailed benchmarks of inference latency, throughput, cost-performance, and model accuracy. It is aimed at anyone building and deploying generative AI applications who needs to choose the best model and infrastructure.

255 stars. No commits in the last 6 months.

Use this if you need to determine the optimal foundation model and AWS serving stack (like SageMaker or Bedrock) for your generative AI application, considering both performance and cost.

Not ideal if you are not deploying or evaluating foundation models on AWS, or if you only need a quick, informal check of model outputs without detailed metrics.

Tags: generative-AI-deployment · LLM-evaluation · cloud-resource-optimization · machine-learning-operations · model-selection
Flags: Stale (6 months) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 20 / 25
(The four category scores sum to the 46 / 100 overall score.)

Stars: 255
Forks: 44
Language: Jupyter Notebook
License: MIT-0
Last pushed: Apr 11, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/aws-samples/foundation-model-benchmarking-tool"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
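
For programmatic access, here is a minimal Python sketch that fetches this repo's quality report and pretty-prints the response. The endpoint URL and rate limits come from the listing above; the response schema is not documented here, so the sketch makes no assumptions about specific field names.

import json
import urllib.request

# Endpoint from the listing above; no API key is needed
# for up to 100 requests/day.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "generative-ai/aws-samples/foundation-model-benchmarking-tool")

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)

# The JSON schema isn't documented in this listing, so pretty-print
# the whole payload rather than guessing at particular keys.
print(json.dumps(data, indent=2))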