modelscope/MCPBench

An evaluation benchmark for MCP servers

Score: 39 / 100 (Emerging)

This tool helps developers and researchers evaluate the performance of Model Context Protocol (MCP) servers, such as those used for web search or database queries. You supply configuration details for the MCP servers you want to test, and the framework outputs metrics such as task-completion accuracy, latency, and token consumption. It is designed for AI practitioners who build or use LLM-powered applications and need to compare server effectiveness.
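As a rough illustration of what those configuration details might look like, here is a minimal sketch expressed as a Python dict. This is purely illustrative: the actual MCPBench config schema is defined in the repository and may differ, and every field name below (servers, name, command, args, tasks, metrics) is an assumption, not the tool's real format.

# Purely illustrative sketch of a benchmark configuration for MCP servers.
# The real MCPBench schema lives in the repository; all field names here
# are assumptions made for illustration only.
example_config = {
    "servers": [
        {
            "name": "web-search-server",   # hypothetical server under test
            "command": "python",           # how the server would be launched
            "args": ["search_server.py"],  # hypothetical entry point
        },
    ],
    "tasks": ["web_search"],               # task suite to benchmark against
    # Metric names taken from the description above.
    "metrics": ["accuracy", "latency", "token_consumption"],
}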

241 stars. No commits in the last 6 months.

Use this if you are an AI developer or researcher who needs to benchmark and compare the performance of MCP servers for tasks like web search or database querying.

Not ideal if you are an end user looking for a ready-made LLM agent; this tool evaluates the underlying servers rather than providing agents to use directly.

Tags: LLM evaluation, AI agent benchmarking, natural language processing, web search, engineering, database query optimization
Badges: Stale (6 months), No Package, No Dependents
Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 11 / 25
(These four component scores sum to the overall 39 / 100.)

Stars: 241
Forks: 15
Language: Python
License: Apache-2.0
Last pushed: Sep 03, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/mcp/modelscope/MCPBench"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
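The same request can be made from Python. Below is a minimal sketch using only the standard library, assuming the endpoint returns JSON; the response's field names are not documented here, so the script simply pretty-prints whatever comes back.

import json
import urllib.request

# Endpoint from the curl example above; no API key needed up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/mcp/modelscope/MCPBench"

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)  # assumes a JSON body, per the API description above

# Pretty-print the quality data (stars, forks, score breakdown, etc.).
print(json.dumps(data, indent=2))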