microsoft/OpenRCA

[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

/ 100

Established

This project helps software reliability engineers and site reliability engineers diagnose why complex software systems fail. It takes telemetry data such as KPI time series, dependency graphs, and log files, along with a natural-language description of a problem, and pinpoints the root cause of the failure. The output helps engineers understand and resolve incidents in their operational software more quickly.

292 stars.

Use this if you are developing or evaluating AI models specifically designed to perform root cause analysis for software system failures and need a comprehensive benchmark with diverse telemetry data.

Not ideal if you are a non-developer seeking a plug-and-play solution for real-time incident resolution without building or integrating AI models.

site-reliability-engineering incident-management software-diagnostics system-monitoring AI-model-evaluation

No Package

No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 17 / 25

Stars

292

Forks

Language

Python

License

MIT

Related tools

PacificAI/langtest

Deliver safe & effective language models

Babelscape/ALERT

Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language...

TrustGen/TrustEval-toolkit

[ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative...

ChenWu98/agent-attack

[ICLR 2025] Dissecting adversarial robustness of multimodal language model agents

Trust4AI/ASTRAL

Automated Safety Testing of Large Language Models
