aws-samples/foundation-model-benchmarking-tool
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark its performance across instance type and serving stack options.
This tool helps machine learning engineers and researchers evaluate the performance and accuracy of large language models (LLMs) and other foundation models. You provide your chosen models and deployment configurations on AWS, and it outputs detailed benchmarks on inference latency, throughput, cost-performance, and model accuracy. It is designed for anyone building and deploying generative AI applications who needs to choose the best model and infrastructure.
255 stars. No commits in the last 6 months.
Use this if you need to determine the optimal foundation model and AWS serving stack (like SageMaker or Bedrock) for your generative AI application, considering both performance and cost.
Not ideal if you are not deploying or evaluating foundation models on AWS, or if you only need a quick, informal check of model outputs without detailed metrics.
Stars: 255
Forks: 44
Language: Jupyter Notebook
License: MIT-0
Category: Generative AI
Last pushed: Apr 11, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/aws-samples/foundation-model-benchmarking-tool"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
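The same request in Python, as a minimal sketch. It assumes only what the page states: the endpoint above is public, needs no key for up to 100 requests/day, and returns JSON.

# Minimal sketch: fetch quality metrics for this repo from the public endpoint.
# Assumes the endpoint returns JSON; no API key is needed for up to 100 requests/day.
import json
import urllib.request

URL = (
    "https://pt-edge.onrender.com/api/v1/quality/"
    "generative-ai/aws-samples/foundation-model-benchmarking-tool"
)

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)

# Pretty-print whatever fields the API returns.
print(json.dumps(data, indent=2))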
Higher-rated alternatives
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...