Agent Monitoring and Debugging LLM Tools

Tools for real-time monitoring, tracing, visualization, and debugging of AI agent execution. Includes cost tracking, performance analysis, and decision transparency. Does NOT include general observability platforms, LLM evaluation frameworks, or agent orchestration/build platforms.

There are 28 agent monitoring and debugging tools tracked. One scores above 50 (established tier). The highest-rated is aimclub/OSA at 52/100 with 134 stars.

Get the projects as JSON (the example request sets limit=20; the category lists 28 projects)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=agent-monitoring-debugging&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
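The same request can be made from a script. The following is a minimal sketch using only the Python standard library. The endpoint and the query parameters (`domain`, `subcategory`, `limit`) are taken from the curl example above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns the parsed payload without assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="agent-monitoring-debugging",
              limit=20):
    """Build the dataset URL with the same query parameters as the curl call."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and return the parsed JSON payload as-is.

    The response schema is an unknown here, so no keys are assumed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Printing the URL does not touch the network or the daily request quota.
print(build_url())
```

Calling `fetch_projects(build_url())` performs the actual request and counts against the daily limit. Whether the API accepts a `limit` larger than 20 is not stated on this page, so treat `build_url(limit=28)` as something to try, not a guarantee.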

#	Tool	Description	Score	Tier	Stars	Language
1	aimclub/OSA	Tool that just makes your open source project better using LLM agents	52	Established	134	Python
2	arthur-ai/arthur-engine	Make AI work for Everyone - Monitoring and governing for your AI/ML	48	Emerging	69	Python
3	typper-io/ai-code-sandbox	Secure Python sandbox for AI/ML code execution using Docker. Run LLM outputs safely.	45	Emerging	61	Python
4	GoPlausible/.github	Plausible is a proof of anything protocol for Algorand	42	Emerging	3	—
5	samarthaggarwal/always-on-debugger	Making terminal debugging 10x faster	40	Emerging	34	Python
6	anilatambharii/argus-ai	The world's first AI observability platform that doesn't just alert you- it...	37	Emerging	3	Python
7	sagents-ai/sagents_live_debugger	A Phoenix LiveView dashboard for debugging and monitoring Sagents agents in...	36	Emerging	18	Elixir
8	Sigmabrogz/agent-devtools	Chrome DevTools for AI Agents - Real-time debugging, pause, inspect, and...	36	Emerging	3	Python
9	quentinducasse/decima	DECIMA (Data Extraction & Contextual Inference for MCNP Analysis) — MCNP...	35	Emerging	1	C++
10	antonpk1/stackfish	Stackfish is an open-source LLM-powered pipeline designed to automatically...	34	Emerging	53	C++
11	anthalehq/anthale-node	Anthale's official TypeScript SDK	34	Emerging	1	TypeScript
12	nionis/near-innovation-sandbox	Safe, auditable, and private AI for governments.	28	Experimental	3	TypeScript
13	anthalehq/anthale-openapi	Anthale's official OpenAPI specification	25	Experimental	1	—
14	ArkFelix7/agentlens	Chrome DevTools for AI Agents — real-time observability, hallucination...	24	Experimental	3	Python
15	SkySingh04/TracePR	AI-powered tool that analyzes PRs using AI to provide observability...	24	Experimental	24	Go
16	anthalehq/anthale-examples	Curated examples and secure design templates to help you build robust,...	23	Experimental	1	—
17	dittonovenska/rfc-agent-validator	🔍 Explore IETF RFCs effortlessly with this AI-driven agent that validates...	22	Experimental	—	Python
18	kadubon/agent-lifecycle-certification-poc	Public, fully local PoCs for counterfactually auditable lifecycle...	22	Experimental	—	Python
19	lukaspfisterch/dbl-gateway	Reference implementation of the DBL execution boundary. Deterministic event...	22	Experimental	—	Python
20	nicholasmacaskill/verithra-agentic-proof-public	Institutional-grade ZK-compliance infrastructure for financial workflows....	22	Experimental	1	Makefile
21	vorionsys/vorion	Open-source AI agent governance - BASIS trust standard, 13,600+ tests, Apache-2.0	21	Experimental	—	TypeScript
22	fabulousengineer0211/VMdebugger	Simple, privacy-first debugging tool for AI agents. See WHERE agents fail...	21	Experimental	—	HTML
23	PovedaAqui/cybertrace-ai	[DEPRECATED] CybertraceAI is an open-source AI agent that simplifies network...	20	Experimental	6	Python
24	ChizhongWang/OpenVerifAIble	VerifAIble - Make every bit of information verifiable to everyone	18	Experimental	13	—
25	Algiras/debugium	Multi-language debugger with real-time web UI and LLM integration via MCP	15	Experimental	2	Rust
26	anayak314/governance-orchestrator	🌐 Streamline governance and context engineering for coding agents to ensure...	14	Experimental	—	Python
27	suedehed/agentkernel	Runtime safety layer for AI agents. Enforce intent. Intercept actions....	14	Experimental	—	Python
28	nnaulia/dao-governance-portal	⚖️ Enable decentralized voting on proposals with a Governance Token in this...	13	Experimental	—	JavaScript

Comparisons in this category

agent-devtools and agentlens (36 vs 24)