LLM Observability & Monitoring Tools

Tools for observing, tracing, monitoring, and evaluating LLM applications in production. Includes metrics collection, span tracking, performance analysis, and system health dashboards. Does NOT include LLM serving infrastructure, prompt management, or general application logging.

There are 73 LLM observability & monitoring tools tracked. 3 score 70 or above (Verified tier). The highest-rated is Arize-ai/openinference at 73/100 with 886 stars. 5 of the top 10 are actively maintained.

Get the projects as JSON (73 are tracked; the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-observability-monitoring&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
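
The same request can be made from a script. The following is a minimal Python sketch that rebuilds the URL from the curl command above and decodes the JSON body. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not documented on this page, so the sketch returns the decoded payload as-is. Whether a larger `limit` value returns all 73 projects is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint and query parameters are copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the dataset URL for this category (hypothetical helper)."""
    params = {
        "domain": "llm-tools",
        "subcategory": "llm-observability-monitoring",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=10):
    """Fetch and decode the JSON payload; no API key is sent (keyless tier)."""
    with urlopen(build_url(limit), timeout=timeout) as resp:
        # The payload structure is not specified here, so return it unchanged.
        return json.load(resp)
```

Calling `fetch_projects()` performs one request, which counts against the 100 requests/day keyless allowance mentioned above.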

#	Tool	Score	Tier	Stars	Language
1	Arize-ai/openinference OpenTelemetry Instrumentation for AI Observability	73	Verified	886	Python
2	vndee/llm-sandbox Lightweight and portable LLM sandbox runtime (code interpreter) Python library.	72	Verified	925	Python
3	apache/hertzbeat An AI-powered next-generation open source real-time observability system.	70	Verified	7,121	Java
4	traceloop/openllmetry Open-source observability for your GenAI or LLM application, based on OpenTelemetry	68	Established	6,906	Python
5	utkuozdemir/nvidia_gpu_exporter Nvidia GPU exporter for prometheus using nvidia-smi binary	59	Established	1,427	Go
6	Dynatrace/obslab-llm-observability Search for a holiday and get destination advice from an LLM. Observability...	56	Established	11	HTML
7	Scale3-Labs/langtrace-python-sdk Langtrace SDK for Python Applications	55	Established	45	Python
8	secretflow/trustflow A privacy-preserving computing system based on TEE.	52	Established	33	C++
9	Ablustrund/MPLSandbox MPLSandbox is an out-of-the-box multi-programming language sandbox designed...	46	Emerging	180	Python
10	openlit/website Open Source OpenTelemetry-native Observability tool for GenAI and LLMs...	45	Emerging	8	TypeScript
11	opensearch-project/observability-stack Opensearch Observability Stack	45	Emerging	6	Python
12	onvo-ai/loghead Loghead is a tool that allows LLMs in your vibe coding tool to have access...	44	Emerging	16	TypeScript
13	clay-good/proxilion-grc Proxilion GRC is a zero-configuration network-layer MITM proxy that secures...	41	Emerging	3	TypeScript
14	HuckleR2003/PC_Workman_HCK Real-time system monitor that explains WHY your PC is slow, not just that...	40	Emerging	20	Python
15	cchinchilla-dev/agentloom Deterministic LLM workflow orchestration with native observability,...	40	Emerging	3	Python
16	mazen160/llmquery Powerful LLM Query Framework with YAML Prompt Templates. Made for Automation	40	Emerging	34	Python
17	chaitanyya/lookout Track, analyze, and improve what LLMs are saying	40	Emerging	56	TypeScript
18	Scale3-Labs/langtrace-typescript-sdk Langtrace SDK for NodeJS Applications	38	Emerging	5	TypeScript
19	jmamda/OpenTrace A local reverse proxy that records every LLM request/response to SQLite. No...	38	Emerging	7	Rust
20	prajeesh-chavan/OpenLLM-Monitor OpenLLM Monitor is a plug-and-play, real-time observability dashboard for...	38	Emerging	20	JavaScript
21	aimusubi/aimusubi Local-first agentic NetOps framework that connects LLMs to real network...	37	Emerging	6	Shell
22	langfuse/oss-llmops-stack Modular, open source LLMOps stack that separates concerns: LiteLLM unifies...	36	Emerging	135	—
23	mxcrafts/ltrack Security Observability Framework for ML/AI Model File Loading	36	Emerging	42	C
24	ZacAttack/HeapDumpStarDiver Allows for fast parsing of an HPROF file to parquet format so that it can be...	35	Emerging	6	Java
25	demml/scouter Monitoring, Evaluation and Observability for AI Applications	35	Emerging	11	Rust
26	eunomia-bpf/ebpf-knowledge-base An ebpf knowledge base, based on llama_index and bpf-developer-tutorial	34	Emerging	10	Rust
27	copyleftdev/robin-smesh 🕸️ Decentralized Dark Web OSINT Framework | Rust | SMESH Signal Diffusion |...	34	Emerging	1	Rust
28	mithril-security/blind_llama_client Zero-trust AI APIs for easy and private consumption of open-source LLMs	33	Emerging	41	Python
29	raaihank/llm-sentinel Privacy-first proxy that automatically detects and masks sensitive data...	30	Emerging	1	Go
30	sarva-20/LLM-Observability-FOSS 🧠 Learn LLM Observability step-by-step using FOSS tools. From zero...	29	Experimental	41	Python
31	Blastgits/traceway Traceway: observability for LLMs	27	Experimental	1	Svelte
32	eullm/eullm Open-source platform for creating, distributing and running sovereign...	27	Experimental	10	Rust
33	ftaghiyev/firewall-configuration-interface A Natural Language Interface for Firewall Configuration	27	Experimental	5	TypeScript
34	JehanneDussert/govllm Production-grade LLM observability stack - sovereign, GDPR-compliant, open source.	27	Experimental	3	Vue
35	thoughtbot/opentelemetry-instrumentation-ruby_llm OpenTelemetry instrumentation for RubyLLM. 💬🔭	26	Experimental	14	Ruby
36	eduardoslonski/telescope Scalable high-performance async RL post-training framework for LLMs with...	26	Experimental	2	Python
37	broomva/vigil OpenTelemetry-native observability — GenAI semantic conventions,...	25	Experimental	1	Rust
38	tied-inc/eval-track LLM-ML-Observability Toolkits and Services	25	Experimental	2	TypeScript
39	james-martinez/lemonade-dashboard A management dashboard for Lemonade Server. This extension provides a visual...	25	Experimental	1	TypeScript
40	saluca-labs/tiresias-core Tiresias core library — shared primitives for the multi-provider LLM proxy...	25	Experimental	1	Python
41	kingfs/llm-tracelab A proxy-based tool for tracing, recording, and replaying LLM API requests.	24	Experimental	1	Go
42	jwilger/union_square Wire-tap your application's LLM interactions for performance analysis and...	24	Experimental	1	Rust
43	Pranavh-2004/DevMonitor DevMonitor is a real-time developer dashboard that aggregates AI research,...	24	Experimental	—	TypeScript
44	JehanneDussert/llm_governance_monitoring Production-grade LLM observability stack - sovereign, GDPR-compliant, open source.	23	Experimental	1	Vue
45	andrewn6/traceway Traceway: observability for LLMs	23	Experimental	1	Svelte
46	DiogoRibeiro7/llm-observability-analytics Observability and analytics layer for LLM systems, capturing...	22	Experimental	—	Python
47	Skobyn/llm-output-governance A practical Python toolkit for evaluating, monitoring, and governing LLM...	22	Experimental	—	Python
48	yeahns278/lemonade-dashboard Manage Lemonade Server models, backends, and settings within VS Code using a...	22	Experimental	—	TypeScript
49	GenesisClawbot/llm-drift LLM drift detector — know within 5 min when GPT-4o, Claude, or Gemini...	22	Experimental	—	Python
50	MoebiusX/KrystalineX KrystalineX — Institutional-grade crypto exchange demo platform with...	22	Experimental	—	TypeScript
51	AdametherzLab/agent-drift-watch CLI that snapshots LLM prompt/response pairs and alerts when model behavior...	22	Experimental	—	TypeScript
52	tyabu12/hamoru "Terraform for LLMs." Declaratively orchestrate multiple LLM providers in...	22	Experimental	—	Rust
53	brookrunning734/trace-ui Visualize and analyze large-scale ARM64 execution traces with fast browsing,...	22	Experimental	—	Rust
54	node-llm/node-llm-monitor Production-grade observability for LLM applications in Node.js.	22	Experimental	1	TypeScript
55	HelgeSverre/llmflow Local-first LLM observability. Trace agents, chains, and LLM calls with...	21	Experimental	—	JavaScript
56	romanmatena/browsermonitor Browser console and network monitoring for debugging and LLM workflows....	21	Experimental	—	JavaScript
57	LakshmiSravyaVedantham/llm-lens A flight recorder for AI agents — replay every LLM call step-by-step to find...	21	Experimental	—	Rust
58	ogulcanaydogan/LLM-SLO-eBPF-Toolkit eBPF-based SLO observability for LLM inference latency on Kubernetes	21	Experimental	—	Go
59	voynow/maintainability LLM driven static code analysis for quantifying maintainability [no longer active]	20	Experimental	6	JavaScript
60	cmangun/llm-observability-dashboards Prometheus + Grafana observability stack for LLM-powered systems	20	Experimental	1	JavaScript
61	quarktetra23/LLM_staticanalysis Pylint Code Analysis for LLMs	18	Experimental	1	—
62	AjaCHN/LLM-API-Sentinel Real-time monitoring and historical availability tracking system for the world's mainstream LLM APIs.	17	Experimental	1	TypeScript
63	modelmetry/modelmetry-sdk-js The Modelmetry JS/TS SDK allows developers to easily integrate Modelmetry’s...	17	Experimental	1	TypeScript
64	Scale3-Labs/langtrace-trace-attributes Trace Attributes for Langtrace	16	Experimental	—	TypeScript
65	medtotti/nektor 🔍 Generate AI-powered tail-based sampling policies for Honeycomb Refinery...	14	Experimental	—	—
66	AbdelStark/dumbmeter A daily snapshot of when popular models drift from their baseline. Auto...	14	Experimental	1	TypeScript
67	prkbuilds/otel-ai-go OpenTelemetry GenAI semantic conventions for Go: drop-in HTTP middleware,...	14	Experimental	1	Go
68	sec-view/FluxPeek FluxPeek is a desktop app for inspecting huge dataset files with...	14	Experimental	1	Rust
69	codex-odyssey/llm-observability Sample application for the Techbookfest #17 (技術書典#17) book "Exploring LLM Application Observability with Us"	14	Experimental	18	Jupyter Notebook
70	kodlan/llm-observability-pack Ready-to-run observability stack for LLM inference servers (vLLM, Triton)...	13	Experimental	—	Python
71	moondef/vhs Record, edit, and replay LLM HTTP interactions. Deterministic testing for AI...	13	Experimental	—	Rust
72	Leizhenpeng/alfred-workflow-llm-token-check 🤞 Alfred 5 workflow: Quickly test whether an API token is valid	11	Experimental	4	—
73	PR0CK0/ProvTracer ProvTracer is a system developed to automatically capture digital artifact...	11	Experimental	2	Python
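
The Tier column appears to follow fixed score bands. The sketch below shows one mapping that reproduces every row in the table above. The cut-offs of 70, 50, and 30 are inferred from the listed rows, not taken from a published scoring rule: the table only shows that 70 is Verified while 68 is Established, 52 is Established while 46 is Emerging, and 30 is Emerging while 29 is Experimental. The function name `tier_for` is hypothetical.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier name.

    Thresholds are assumptions inferred from the table, not an official rule.
    """
    if score >= 70:
        return "Verified"
    if score >= 50:  # assumed boundary; the table only shows 52 vs 46
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# Spot-check against rows taken from the table above.
rows = [(73, "Verified"), (70, "Verified"), (68, "Established"),
        (52, "Established"), (46, "Emerging"), (30, "Emerging"),
        (29, "Experimental"), (11, "Experimental")]
mismatches = [(s, t) for s, t in rows if tier_for(s) != t]
```

With these assumed bands, `mismatches` is empty for the sampled rows, which also matches the count of three Verified projects stated at the top of the page.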

Comparisons in this category

openinference and openllmetry (73 vs 68)
openllmetry and website (68 vs 45)
llm-sandbox and MPLSandbox (72 vs 46)
openinference and website (73 vs 45)
langtrace-python-sdk and langtrace-typescript-sdk (55 vs 38)