All AI Evaluation Tools
216 tools ranked by quality score
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | DataDog/dd-trace-js<br>Datadog APM client for Node.js | | Verified |
| 2 | lmnr-ai/lmnr<br>Laminar - open-source observability platform purpose-built for AI agents. YC S24. | | Verified |
| 3 | mnfst/manifest<br>Smart LLM Routing for OpenClaw. Cut Costs up to 70% 🦞🦚 | | Verified |
| 4 | open-telemetry/opentelemetry-rust<br>The Rust OpenTelemetry implementation | | Verified |
| 5 | tokio-rs/tracing<br>Application level tracing for Rust. | | Verified |
| 6 | DataDog/dd-trace-go<br>Datadog Go Library including APM tracing, profiling, and security monitoring. | | Verified |
| 7 | pinpoint-apm/pinpoint<br>APM, (Application Performance Management) tool for large-scale distributed systems. | | Verified |
| 8 | DataDog/dd-trace-py<br>Datadog Python APM Client | | Verified |
| 9 | open-telemetry/opentelemetry-go<br>OpenTelemetry Go API and SDK | | Verified |
| 10 | jaegertracing/jaeger-ui<br>Web UI for Jaeger | | Verified |
| 11 | DataDog/datadog-agent<br>Main repository for Datadog Agent | | Verified |
| 12 | open-telemetry/opentelemetry-go-instrumentation<br>OpenTelemetry Auto Instrumentation using eBPF | | Verified |
| 13 | opentracing-contrib/nginx-opentracing<br>NGINX plugin for OpenTracing | | Verified |
| 14 | openzipkin/zipkin<br>Zipkin is a distributed tracing system | | Verified |
| 15 | NVIDIA/garak<br>the LLM vulnerability scanner | | Verified |
| 16 | winsiderss/systeminformer<br>A free, powerful, multi-purpose tool that helps you monitor system... | | Verified |
| 17 | namhyung/uftrace<br>Function graph tracer for C/C++/Rust/Python | | Verified |
| 18 | jaegertracing/jaeger<br>CNCF Jaeger, a Distributed Tracing Platform | | Verified |
| 19 | autogluon/fev<br>Forecast evaluation library | | Verified |
| 20 | confident-ai/deepeval<br>The LLM Evaluation Framework | | Verified |
| 21 | inikep/lzbench<br>lzbench is an in-memory benchmark of open-source compressors | | Verified |
| 22 | bpftrace/bpftrace<br>High-level tracing language for Linux | | Verified |
| 23 | gofr-dev/gofr<br>An opinionated GoLang framework for accelerated microservice development.... | | Verified |
| 24 | SigNoz/signoz<br>SigNoz is an open-source observability platform native to OpenTelemetry with... | | Verified |
| 25 | GreptimeTeam/greptimedb<br>The open-source Observability 2.0 database. One engine for metrics, logs,... | | Verified |
| 26 | libbpf/libbpf<br>Automated upstream mirror for libbpf stand-alone build. | | Verified |
| 27 | iipeace/guider<br>The All-in-One System Profiling and Fault Detection Tool for Linux & Android | | Verified |
| 28 | pydantic/logfire<br>AI observability platform for production LLM and agent systems. | | Established |
| 29 | CodSpeedHQ/pytest-codspeed<br>A pytest plugin to create benchmarks | | Established |
| 30 | dotnet/BenchmarkDotNet<br>Powerful .NET library for benchmarking | | Established |
| 31 | CodSpeedHQ/codspeed-rust<br>Crates to benchmark your Rust code | | Established |
| 32 | alibaba/loongsuite-go-agent<br>OpenTelemetry Compile-Time Instrumentation for Golang | | Established |
| 33 | coroot/coroot<br>Coroot is an open-source observability and APM tool with AI-powered Root... | | Established |
| 34 | flightlessmango/MangoHud<br>A Vulkan and OpenGL overlay for monitoring FPS, temperatures, CPU/GPU load and more. | | Established |
| 35 | metrico/gigapipe<br>⭐️ The Open-Source Polyglot Observability Warehouse: Light, Fast, Cloud... | | Established |
| 36 | TPC-Council/HammerDB<br>HammerDB: The industry standard open-source database benchmark | | Established |
| 37 | DataDog/dd-trace-java<br>Datadog APM client for Java | | Established |
| 38 | DataDog/dd-trace-php<br>Datadog PHP Clients | | Established |
| 39 | DataDog/dd-trace-rb<br>Datadog's client library for Ruby | | Established |
| 40 | jaegertracing/helm-charts<br>Helm Charts for Jaeger backend | | Established |
| 41 | DataDog/dd-sdk-ios<br>Datadog SDK for iOS - Swift and Objective-C. | | Established |
| 42 | open-telemetry/opentelemetry-ruby-contrib<br>Contrib Packages for the OpenTelemetry Ruby API and SDK implementation. | | Established |
| 43 | gogf/gf<br>A powerful framework for faster, easier, and more efficient project development. | | Established |
| 44 | RafaelGSS/bench-node<br>A powerful Node.js benchmark library | | Established |
| 45 | DataDog/dd-trace-dotnet<br>.NET Client Library for Datadog APM | | Established |
| 46 | open-telemetry/opentelemetry-php<br>The OpenTelemetry PHP Library | | Established |
| 47 | reframe-hpc/reframe<br>A powerful Python framework for writing and running portable regression... | | Established |
| 48 | verifywise-ai/verifywise<br>Complete AI governance and LLM Evals platform with support for EU AI Act,... | | Established |
| 49 | rabbitmq/rabbitmq-perf-test<br>A load testing tool | | Established |
| 50 | oushujun/EDTA<br>Extensive de-novo TE Annotator | | Established |
| 51 | nowsecure/fsmon<br>Filesystem monitor tool for Linux/Android iOS/macOS | | Established |
| 52 | typelevel/natchez<br>functional tracing for cats | | Established |
| 53 | cloudflare/ebpf_exporter<br>Prometheus exporter for custom eBPF metrics | | Established |
| 54 | zio/zio-logging<br>Powerful logging for ZIO 2.0 applications, with compatibility with many... | | Established |
| 55 | lttng/lttng-tools<br>The lttng-tools project provides a session daemon (lttng-sessiond) that acts... | | Established |
| 56 | efficios/babeltrace<br>Babeltrace /ˈbæbəltreɪs/ is an open-source trace manipulation toolkit. | | Established |
| 57 | huggingface/aisheets<br>Build, enrich, and transform datasets using AI models with no code | | Established |
| 58 | typelevel/otel4s<br>An OpenTelemetry library for Scala based on Cats-Effect | | Established |
| 59 | fastify/fastify-zipkin<br>Fastify plugin for Zipkin distributed tracing system. | | Established |
| 60 | dash0hq/otelbin<br>Web-based tool to facilitate OpenTelemetry collector configuration editing... | | Established |
| 61 | iand675/hs-opentelemetry<br>OpenTelemetry support for the Haskell programming language | | Established |
| 62 | swift-otel/swift-otel<br>An OpenTelemetry Protocol (OTLP) backend for Swift Log, Swift Metrics, and... | | Established |
| 63 | godotengine/godot-benchmarks<br>Collection of benchmarks to test performance of different areas of Godot | | Established |
| 64 | cilium/pwru<br>Packet, where are you? -- eBPF-based Linux kernel networking debugger | | Established |
| 65 | instana/go-sensor<br>🚀 Go Distributed Tracing & Metrics Sensor for Instana | | Established |
| 66 | signalfx/tracing-examples<br>Examples of using third-party tracers with SignalFx | | Established |
| 67 | signalfx/splunk-otel-java<br>Splunk Distribution of OpenTelemetry Java | | Established |
| 68 | instana/nodejs<br>Node.js in-process collectors for Instana | | Established |
| 69 | team-decent/decent-bench<br>A benchmarking framework for decentralized optimization | | Established |
| 70 | kieker-monitoring/kieker<br>Kieker is an observability framework, that consists of an monitoring and... | | Established |
| 71 | jonahsnider/benchmark<br>A Node.js benchmarking library with support for multithreading and TurboFan... | | Established |
| 72 | dynatrace-oss/unguard<br>Unguard is an insecure cloud-native microservices demo application. | | Established |
| 73 | instana/python-sensor<br>🐍 Python Distributed Tracing & Metrics Sensor for Instana | | Established |
| 74 | munich-quantum-toolkit/bench<br>MQT Bench - An MQT Tool for Benchmarking Quantum Software Tools | | Established |
| 75 | ertgl/tapable-tracer<br>Trace the connections and flows between tapable hooks. | | Established |
| 76 | uio-bmi/immuneML<br>immuneML is a platform for machine learning analysis of adaptive immune... | | Established |
| 77 | ant-research/EasyTemporalPointProcess<br>EasyTPP: Towards Open Benchmarking Temporal Point Processes | | Established |
| 78 | nhsengland/evalsense<br>Tools for systematic large language model evaluations | | Established |
| 79 | instana/ruby-sensor<br>💎 Ruby Distributed Tracing & Metrics Sensor for Instana | | Established |
| 80 | atesgoral/hrm-solutions<br>Human Resource Machine solutions and size/speed hacks | | Established |
| 81 | bamlab/flashlight<br>📱⚡️ Lighthouse for Mobile - audits your app and gives a performance score to... | | Established |
| 82 | ldbc/ldbc_snb_docs<br>Specification of the LDBC Social Network Benchmark suite | | Established |
| 83 | aliesbelik/load-testing-toolkit<br>Collection of open-source tools for debugging, benchmarking, load and stress... | | Established |
| 84 | unitaryfoundation/metriq-gym<br>metriq-gym is a framework for implementing and running standard quantum... | | Established |
| 85 | ryncsn/memstrack<br>A memory allocation tracer combined with stack trace. | | Established |
| 86 | GDATASoftwareAG/motornet<br>Motor.NET is a microservice framework based on Microsoft.Extensions.Hosting | | Established |
| 87 | argonne-lcf/THAPI<br>A tracing infrastructure for heterogeneous computing applications. | | Established |
| 88 | DataDog/nginx-datadog<br>Enhance NGINX Observability and Security with Datadog's Module | | Established |
| 89 | bencheeorg/benchee<br>Easy and extensible benchmarking in Elixir providing you with lots of statistics! | | Established |
| 90 | chirpz-ai/pandaprobe<br>🐼 Open source agent engineering platform: traces, evals, and metrics to... | | Established |
| 91 | jnidzwetzki/pg-lock-tracer<br>An eBPF based lock tracer for PostgreSQL | | Established |
| 92 | cau-se/theodolite<br>Theodolite is a framework for benchmarking the horizontal and vertical... | | Established |
| 93 | bencherdev/bencher<br>🐰 Bencher - Continuous Benchmarking | | Established |
| 94 | hendriknielaender/zBench<br>📊 zig benchmark | | Established |
| 95 | DataDog/dd-trace-cpp<br>Datadog APM client for C++ | | Established |
| 96 | cmackenzie1/tracing-ndjson<br>A customizable NDJSON format for tracing in Rust | | Established |
| 97 | prestodb/pbench<br>Presto/Prestissimo Benchmark Toolset | | Established |
| 98 | elastic/elastic-otel-dotnet<br>Elastic OpenTelemetry .NET Distribution | | Established |
| 99 | signalfx/splunk-otel-dotnet<br>Splunk Distribution of OpenTelemetry .NET | | Established |
| 100 | FrankChen021/bithon<br>A full stack observability platform | | Established |