ag2ai/Agents_Failure_Attribution
Benchmark for automated failure attribution in agentic systems (🏆 ICML 2025 Spotlight)
This project helps developers of multi-agent systems quickly pinpoint why their AI agents fail. Given the log of a failed multi-agent task, it identifies which specific agent and which step caused the failure, and produces a natural-language explanation. It is aimed at AI system developers, researchers, and engineers working with LLM-based multi-agent architectures.
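The sketch below illustrates that input/output contract. It is a hypothetical stand-in, not the repository's actual API: the function name attribute_failure, the JSON log format, and every field name are illustrative assumptions; consult the repo's README for the real entry point.

import json

def attribute_failure(log_path: str) -> dict:
    """Hypothetical stand-in for the attribution step: scan a
    failed-task log and blame the first step that reports an error."""
    with open(log_path) as f:
        steps = json.load(f)  # assumed format: a list of per-step records
    for i, step in enumerate(steps):
        if step.get("status") == "error":
            return {
                "failing_agent": step.get("agent"),
                "failing_step": i,
                "explanation": step.get("message", "no message recorded"),
            }
    return {"failing_agent": None, "failing_step": None,
            "explanation": "no failing step found"}

# Example: a two-step log where the second agent errors out.
example_log = [
    {"agent": "planner", "status": "ok", "message": "plan produced"},
    {"agent": "coder", "status": "error", "message": "tool call returned 404"},
]
with open("failed_task_log.json", "w") as f:
    json.dump(example_log, f)

print(attribute_failure("failed_task_log.json"))
# -> {'failing_agent': 'coder', 'failing_step': 1, 'explanation': 'tool call returned 404'}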
Use this if you are building or debugging complex multi-agent AI systems and want to automate root-cause identification for task failures.
Not ideal if you are not working with multi-agent systems or are primarily debugging single-agent AI models; the tool is designed specifically for inter-agent failure attribution.
Stars: 349
Forks: 23
Language: Python
License: MIT
Category:
Last pushed: Feb 11, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/ag2ai/Agents_Failure_Attribution"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
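The same endpoint can also be called programmatically. A minimal Python sketch using the requests library, assuming the endpoint returns JSON; the response schema is not documented here, so inspect the payload before relying on specific fields:

import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/agents/"
       "ag2ai/Agents_Failure_Attribution")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # surface HTTP errors (e.g. rate limiting) early
data = resp.json()       # assumption: the endpoint returns a JSON body
print(data)              # dump the raw payload to discover the schema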
Higher-rated alternatives
StonyBrookNLP/appworld
🌍 AppWorld: A Controllable World of Apps and People for Benchmarking Function Calling and...
qualifire-dev/rogue
AI Agent Evaluator & Red Team Platform
microsoft/WindowsAgentArena
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of...
future-agi/ai-evaluation
Evaluation Framework for all your AI related Workflows
RouteWorks/RouterArena
RouterArena: An open framework for evaluating LLM routers with standardized datasets, metrics,...