LLM Observability & Monitoring Tools

Tools for observing, tracing, monitoring, and evaluating LLM applications in production. Includes metrics collection, span tracking, performance analysis, and system health dashboards. Does NOT include LLM serving infrastructure, prompt management, or general application logging.

There are 73 LLM observability & monitoring tools tracked. 3 score 70 or higher (Verified tier). The highest-rated is Arize-ai/openinference at 73/100 with 886 stars. 5 of the top 10 are actively maintained.

Get all 73 projects as JSON (note that the example request passes limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-observability-monitoring&limit=20"

Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000 requests/day.
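For scripted use, the same request can be assembled and sent from Python. The following is a minimal sketch using only the standard library. The endpoint and the three query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the sketch returns the decoded body without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools",
              subcategory="llm-observability-monitoring",
              limit=20):
    """Assemble the query string used by the curl example (hypothetical helper)."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """GET the URL and decode the body as JSON.

    The response schema is an unknown here, so the decoded object is
    returned as-is for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Printing the URL makes no network call; call fetch_projects(build_url())
    # to send the request, which counts against the daily quota.
    print(build_url())
```

Whether a larger `limit` value returns all 73 projects in one response is an assumption worth checking against the API before relying on it.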

Rank Tool (Score, Tier): Description
1 Arize-ai/openinference (73, Verified): OpenTelemetry Instrumentation for AI Observability
2 vndee/llm-sandbox (72, Verified): Lightweight and portable LLM sandbox runtime (code interpreter) Python library.
3 apache/hertzbeat (70, Verified): An AI-powered next-generation open source real-time observability system.
4 traceloop/openllmetry (68, Established): Open-source observability for your GenAI or LLM application, based on OpenTelemetry
5 utkuozdemir/nvidia_gpu_exporter (59, Established): Nvidia GPU exporter for Prometheus using nvidia-smi binary
6 Dynatrace/obslab-llm-observability (56, Established): Search for a holiday and get destination advice from an LLM. Observability...
7 Scale3-Labs/langtrace-python-sdk (55, Established): Langtrace SDK for Python Applications
8 secretflow/trustflow (52, Established): A privacy-preserving computing system based on TEE.
9 Ablustrund/MPLSandbox (46, Emerging): MPLSandbox is an out-of-the-box multi-programming language sandbox designed...
10 openlit/website (45, Emerging): Open Source OpenTelemetry-native Observability tool for GenAI and LLMs...
11 opensearch-project/observability-stack (45, Emerging): OpenSearch Observability Stack
12 onvo-ai/loghead (44, Emerging): Loghead is a tool that allows LLMs in your vibe coding tool to have access...
13 clay-good/proxilion-grc (41, Emerging): Proxilion GRC is a zero-configuration network-layer MITM proxy that secures...
14 HuckleR2003/PC_Workman_HCK (40, Emerging): Real-time system monitor that explains WHY your PC is slow, not just that...
15 cchinchilla-dev/agentloom (40, Emerging): Deterministic LLM workflow orchestration with native observability,...
16 mazen160/llmquery (40, Emerging): Powerful LLM Query Framework with YAML Prompt Templates. Made for Automation
17 chaitanyya/lookout (40, Emerging): Track, analyze, and improve what LLMs are saying
18 Scale3-Labs/langtrace-typescript-sdk (38, Emerging): Langtrace SDK for NodeJS Applications
19 jmamda/OpenTrace (38, Emerging): A local reverse proxy that records every LLM request/response to SQLite. No...
20 prajeesh-chavan/OpenLLM-Monitor (38, Emerging): OpenLLM Monitor is a plug-and-play, real-time observability dashboard for...
21 aimusubi/aimusubi (37, Emerging): Local-first agentic NetOps framework that connects LLMs to real network...
22 langfuse/oss-llmops-stack (36, Emerging): Modular, open source LLMOps stack that separates concerns: LiteLLM unifies...
23 mxcrafts/ltrack (36, Emerging): Security Observability Framework for ML/AI Model File Loading
24 ZacAttack/HeapDumpStarDiver (35, Emerging): Allows for fast parsing of an HPROF file to parquet format so that it can be...
25 demml/scouter (35, Emerging): Monitoring, Evaluation and Observability for AI Applications
26 eunomia-bpf/ebpf-knowledge-base (34, Emerging): An eBPF knowledge base, based on llama_index and bpf-developer-tutorial
27 copyleftdev/robin-smesh (34, Emerging): 🕸️ Decentralized Dark Web OSINT Framework | Rust | SMESH Signal Diffusion |...
28 mithril-security/blind_llama_client (33, Emerging): Zero-trust AI APIs for easy and private consumption of open-source LLMs
29 raaihank/llm-sentinel (30, Emerging): Privacy-first proxy that automatically detects and masks sensitive data...
30 sarva-20/LLM-Observability-FOSS (29, Experimental): 🧠 Learn LLM Observability step-by-step using FOSS tools. From zero...
31 Blastgits/traceway (27, Experimental): Traceway: observability for LLMs
32 eullm/eullm (27, Experimental): Open-source platform for creating, distributing and running sovereign...
33 ftaghiyev/firewall-configuration-interface (27, Experimental): A Natural Language Interface for Firewall Configuration
34 JehanneDussert/govllm (27, Experimental): Production-grade LLM observability stack - sovereign, GDPR-compliant, open source.
35 thoughtbot/opentelemetry-instrumentation-ruby_llm (26, Experimental): OpenTelemetry instrumentation for RubyLLM. 💬🔭
36 eduardoslonski/telescope (26, Experimental): Scalable high-performance async RL post-training framework for LLMs with...
37 broomva/vigil (25, Experimental): OpenTelemetry-native observability - GenAI semantic conventions,...
38 tied-inc/eval-track (25, Experimental): LLM-ML-Observability Toolkits and Services
39 james-martinez/lemonade-dashboard (25, Experimental): A management dashboard for Lemonade Server. This extension provides a visual...
40 saluca-labs/tiresias-core (25, Experimental): Tiresias core library - shared primitives for the multi-provider LLM proxy...
41 kingfs/llm-tracelab (24, Experimental): A proxy-based tool for tracing, recording, and replaying LLM API requests.
42 jwilger/union_square (24, Experimental): Wire-tap your application's LLM interactions for performance analysis and...
43 Pranavh-2004/DevMonitor (24, Experimental): DevMonitor is a real-time developer dashboard that aggregates AI research,...
44 JehanneDussert/llm_governance_monitoring (23, Experimental): Production-grade LLM observability stack - sovereign, GDPR-compliant, open source.
45 andrewn6/traceway (23, Experimental): Traceway: observability for LLMs
46 DiogoRibeiro7/llm-observability-analytics (22, Experimental): Observability and analytics layer for LLM systems, capturing...
47 Skobyn/llm-output-governance (22, Experimental): A practical Python toolkit for evaluating, monitoring, and governing LLM...
48 yeahns278/lemonade-dashboard (22, Experimental): Manage Lemonade Server models, backends, and settings within VS Code using a...
49 GenesisClawbot/llm-drift (22, Experimental): LLM drift detector - know within 5 min when GPT-4o, Claude, or Gemini...
50 MoebiusX/KrystalineX (22, Experimental): KrystalineX - Institutional-grade crypto exchange demo platform with...
51 AdametherzLab/agent-drift-watch (22, Experimental): CLI that snapshots LLM prompt/response pairs and alerts when model behavior...
52 tyabu12/hamoru (22, Experimental): "Terraform for LLMs." Declaratively orchestrate multiple LLM providers in...
53 brookrunning734/trace-ui (22, Experimental): Visualize and analyze large-scale ARM64 execution traces with fast browsing,...
54 node-llm/node-llm-monitor (22, Experimental): Production-grade observability for LLM applications in Node.js.
55 HelgeSverre/llmflow (21, Experimental): Local-first LLM observability. Trace agents, chains, and LLM calls with...
56 romanmatena/browsermonitor (21, Experimental): Browser console and network monitoring for debugging and LLM workflows....
57 LakshmiSravyaVedantham/llm-lens (21, Experimental): A flight recorder for AI agents - replay every LLM call step-by-step to find...
58 ogulcanaydogan/LLM-SLO-eBPF-Toolkit (21, Experimental): eBPF-based SLO observability for LLM inference latency on Kubernetes
59 voynow/maintainability (20, Experimental): LLM driven static code analysis for quantifying maintainability [no longer active]
60 cmangun/llm-observability-dashboards (20, Experimental): Prometheus + Grafana observability stack for LLM-powered systems
61 quarktetra23/LLM_staticanalysis (18, Experimental): Pylint Code Analysis for LLMs
62 AjaCHN/LLM-API-Sentinel (17, Experimental): Real-time monitoring and historical availability tracking system for the world's mainstream large-model APIs.
63 modelmetry/modelmetry-sdk-js (17, Experimental): The Modelmetry JS/TS SDK allows developers to easily integrate Modelmetry’s...
64 Scale3-Labs/langtrace-trace-attributes (16, Experimental): Trace Attributes for Langtrace
65 medtotti/nektor (14, Experimental): 🔍 Generate AI-powered tail-based sampling policies for Honeycomb Refinery...
66 AbdelStark/dumbmeter (14, Experimental): A daily snapshot of when popular models drift from their baseline. Auto...
67 prkbuilds/otel-ai-go (14, Experimental): OpenTelemetry GenAI semantic conventions for Go: drop-in HTTP middleware,...
68 sec-view/FluxPeek (14, Experimental): FluxPeek is a desktop app for inspecting huge dataset files with...
69 codex-odyssey/llm-observability (14, Experimental): Sample application used in the Techbookfest #17 book "Exploring Observability of LLM Applications with Us"
70 kodlan/llm-observability-pack (13, Experimental): Ready-to-run observability stack for LLM inference servers (vLLM, Triton)...
71 moondef/vhs (13, Experimental): Record, edit, and replay LLM HTTP interactions. Deterministic testing for AI...
72 Leizhenpeng/alfred-workflow-llm-token-check (11, Experimental): 🤞 Alfred 5 workflow: Quickly test whether an API token is valid
73 PR0CK0/ProvTracer (11, Experimental): ProvTracer is a system developed to automatically capture digital artifact...
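
The page does not state the score cutoffs behind the four tiers, but the listing constrains them: the lowest Verified score is 70, Established runs from 52 to 68, Emerging from 30 to 46, and Experimental starts at 29. The sketch below maps a score to a tier under assumed cutoffs of 70, 50, and 30. The 70 and 30 boundaries match the listing exactly; the 50 boundary is a guess, since the real cutoff could sit anywhere from 47 to 52. The function name `tier_for` is hypothetical.

```python
from collections import Counter


def tier_for(score: int) -> str:
    """Map a 0-100 score to a tier name.

    The cutoffs are assumptions inferred from the listing, not
    thresholds published by the site.
    """
    if score >= 70:
        return "Verified"
    if score >= 50:  # assumed; the listing only shows 52 -> Established, 46 -> Emerging
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (tool, score) pairs copied from the listing, used as a spot check.
SAMPLE = [
    ("Arize-ai/openinference", 73),
    ("apache/hertzbeat", 70),
    ("secretflow/trustflow", 52),
    ("Ablustrund/MPLSandbox", 46),
    ("raaihank/llm-sentinel", 30),
    ("sarva-20/LLM-Observability-FOSS", 29),
]

if __name__ == "__main__":
    # Count how many sample rows fall into each tier.
    counts = Counter(tier_for(score) for _, score in SAMPLE)
    for tier, n in counts.items():
        print(tier, n)
```

Under these assumed cutoffs every row in the listing gets the tier shown next to it, which is a consistency check on the guess rather than confirmation of the site's actual rule.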