mcp-tool-bench/MCPToolBenchPP

MCPToolBench++: an MCP (Model Context Protocol) benchmark of AI agent and model tool-use ability

37 / 100 · Emerging

This benchmark evaluates how well AI agents can use various tools to complete tasks. You provide a task description, and it assesses the agent's ability to use tools such as web browsers, file systems, search engines, maps, and payment systems to produce an outcome. It is designed for AI researchers and developers who are building or selecting AI agents and need to rigorously test their practical problem-solving capabilities.

Use this if you are developing AI agents and need a standardized way to measure their performance when interacting with a wide range of real-world tools and services.

Not ideal if you are an end user looking for a ready-to-use AI agent to solve a specific business problem; this is a developer-focused evaluation tool.

Tags: AI-agent-evaluation · tool-use-benchmarking · AI-model-assessment · agent-development · practical-AI-testing
No License · No Package · No Dependents
Maintenance 6 / 25
Adoption 7 / 25
Maturity 7 / 25
Community 17 / 25

The four sub-scores (each out of 25) sum to the overall score: 6 + 7 + 7 + 17 = 37 / 100.

How are scores calculated?

Stars: 41
Forks: 8
Language: Python
License: None
Last pushed: Dec 17, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mcp/mcp-tool-bench/MCPToolBenchPP"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
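The same endpoint can be called from code. A minimal Python sketch using only the standard library, assuming the URL pattern from the curl example above (`/api/v1/quality/mcp/<owner>/<repo>`); the shape of the returned JSON is not documented here, so the fetch helper simply returns the decoded payload as a dict:

```python
# Sketch: query the pt-edge quality API for a GitHub owner/repo pair.
# The path segments follow the curl example on this page; the response
# fields are an assumption and may differ from the real API.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub owner/repo."""
    return f"{BASE}/mcp/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """GET the quality record (anonymous access: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the URL used by the curl example above.
    print(quality_url("mcp-tool-bench", "MCPToolBenchPP"))
```

With an API key (the free tier mentioned above), you would typically pass it as a header; since the header name is not documented on this page, it is omitted here.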