Uncategorized AI Evaluation Tools

There are 216 uncategorized tools tracked. 27 score 70 or above (Verified tier). The highest-rated is DataDog/dd-trace-js at 95/100 with 790 stars and 26,477,155 monthly downloads. All of the top 10 are actively maintained.
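The tier labels in the table below appear to follow fixed score bands. Here is a minimal sketch of that mapping. The boundaries (70, 50, 30) are inferred from the listed rows, not from a documented rule, and `tier_for` is a hypothetical helper name.

```python
def tier_for(score: int) -> str:
    """Map a 0-100 score to a tier label.

    Assumption: the cut-offs are inferred from the table rows
    (70 -> Verified, 69 -> Established, 49 -> Emerging, 29 -> Experimental).
    """
    if score >= 70:
        return "Verified"
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"
```

If the site publishes different thresholds, only the three constants need to change.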

Get the projects as JSON (the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ai-evals&subcategory=uncategorized&limit=20"
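The same request can be made from Python with only the standard library. This is a sketch: the URL and query parameters are copied from the curl command above, while the function names `build_url` and `fetch_projects` are hypothetical and the shape of the JSON response is not documented here, so the result is returned as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ai-evals", subcategory="uncategorized", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    Assumption: the endpoint returns a JSON document; its structure is
    not specified on this page, so no fields are accessed here.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the request. Keeping URL construction separate from the network call makes it easy to check the query string without using up the daily request quota.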

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.

#	Tool	Score	Tier	Stars	Language
1	DataDog/dd-trace-js Datadog APM client for Node.js	95	Verified	790	JavaScript
2	lmnr-ai/lmnr Laminar - open-source observability platform purpose-built for AI agents. YC S24.	88	Verified	2,770	TypeScript
3	mnfst/manifest Smart LLM Routing for OpenClaw. Cut Costs up to 70% 🦞🦚	87	Verified	4,268	TypeScript
4	open-telemetry/opentelemetry-rust The Rust OpenTelemetry implementation	81	Verified	2,536	Rust
5	tokio-rs/tracing Application level tracing for Rust.	78	Verified	6,617	Rust
6	DataDog/dd-trace-go Datadog Go Library including APM tracing, profiling, and security monitoring.	76	Verified	843	Go
7	pinpoint-apm/pinpoint APM (Application Performance Management) tool for large-scale distributed systems.	76	Verified	13,819	Java
8	DataDog/dd-trace-py Datadog Python APM Client	76	Verified	637	Python
9	open-telemetry/opentelemetry-go OpenTelemetry Go API and SDK	76	Verified	6,344	Go
10	jaegertracing/jaeger-ui Web UI for Jaeger	76	Verified	1,456	JavaScript
11	DataDog/datadog-agent Main repository for Datadog Agent	76	Verified	3,565	Go
12	open-telemetry/opentelemetry-go-instrumentation OpenTelemetry Auto Instrumentation using eBPF	73	Verified	995	C
13	opentracing-contrib/nginx-opentracing NGINX plugin for OpenTracing	73	Verified	513	C++
14	openzipkin/zipkin Zipkin is a distributed tracing system	73	Verified	17,423	Java
15	NVIDIA/garak the LLM vulnerability scanner	72	Verified	7,510	HTML
16	winsiderss/systeminformer A free, powerful, multi-purpose tool that helps you monitor system...	72	Verified	13,904	C
17	namhyung/uftrace Function graph tracer for C/C++/Rust/Python	72	Verified	3,418	C
18	jaegertracing/jaeger CNCF Jaeger, a Distributed Tracing Platform	72	Verified	22,666	Go
19	autogluon/fev Forecast evaluation library	72	Verified	156	Python
20	confident-ai/deepeval The LLM Evaluation Framework	71	Verified	14,681	Python
21	inikep/lzbench lzbench is an in-memory benchmark of open-source compressors	71	Verified	1,052	C
22	bpftrace/bpftrace High-level tracing language for Linux	71	Verified	10,049	C++
23	gofr-dev/gofr An opinionated GoLang framework for accelerated microservice development....	70	Verified	21,601	Go
24	SigNoz/signoz SigNoz is an open-source observability platform native to OpenTelemetry with...	70	Verified	26,471	TypeScript
25	GreptimeTeam/greptimedb The open-source Observability 2.0 database. One engine for metrics, logs,...	70	Verified	6,136	Rust
26	libbpf/libbpf Automated upstream mirror for libbpf stand-alone build.	70	Verified	2,673	C
27	iipeace/guider The All-in-One System Profiling and Fault Detection Tool for Linux & Android	70	Verified	668	Python
28	pydantic/logfire AI observability platform for production LLM and agent systems.	69	Established	4,158	Python
29	CodSpeedHQ/pytest-codspeed A pytest plugin to create benchmarks	69	Established	122	Python
30	dotnet/BenchmarkDotNet Powerful .NET library for benchmarking	69	Established	11,389	C#
31	CodSpeedHQ/codspeed-rust Crates to benchmark your Rust code	67	Established	59	Rust
32	alibaba/loongsuite-go-agent OpenTelemetry Compile-Time Instrumentation for Golang	66	Established	843	Go
33	coroot/coroot Coroot is an open-source observability and APM tool with AI-powered Root...	66	Established	7,557	Go
34	flightlessmango/MangoHud A Vulkan and OpenGL overlay for monitoring FPS, temperatures, CPU/GPU load and more.	66	Established	8,436	C
35	metrico/gigapipe ⭐️ The Open-Source Polyglot Observability Warehouse: Light, Fast, Cloud...	65	Established	1,655	Go
36	TPC-Council/HammerDB HammerDB: The industry standard open-source database benchmark	64	Established	745	Tcl
37	DataDog/dd-trace-java Datadog APM client for Java	64	Established	703	Java
38	DataDog/dd-trace-php Datadog PHP Clients	64	Established	550	PHP
39	DataDog/dd-trace-rb Datadog's client library for Ruby	64	Established	392	Ruby
40	jaegertracing/helm-charts Helm Charts for Jaeger backend	64	Established	315	Mustache
41	DataDog/dd-sdk-ios Datadog SDK for iOS - Swift and Objective-C.	64	Established	279	Swift
42	open-telemetry/opentelemetry-ruby-contrib Contrib Packages for the OpenTelemetry Ruby API and SDK implementation.	64	Established	119	Ruby
43	gogf/gf A powerful framework for faster, easier, and more efficient project development.	64	Established	13,107	Go
44	RafaelGSS/bench-node A powerful Node.js benchmark library	64	Established	182	JavaScript
45	DataDog/dd-trace-dotnet .NET Client Library for Datadog APM	64	Established	553	C#
46	open-telemetry/opentelemetry-php The OpenTelemetry PHP Library	64	Established	889	PHP
47	reframe-hpc/reframe A powerful Python framework for writing and running portable regression...	63	Established	271	Python
48	verifywise-ai/verifywise Complete AI governance and LLM Evals platform with support for EU AI Act,...	63	Established	256	TypeScript
49	rabbitmq/rabbitmq-perf-test A load testing tool	63	Established	416	Java
50	oushujun/EDTA Extensive de-novo TE Annotator	63	Established	438	Perl
51	nowsecure/fsmon Filesystem monitor tool for Linux/Android iOS/macOS	62	Established	1,005	C
52	typelevel/natchez functional tracing for cats	62	Established	338	Scala
53	cloudflare/ebpf_exporter Prometheus exporter for custom eBPF metrics	62	Established	2,549	Go
54	zio/zio-logging Powerful logging for ZIO 2.0 applications, with compatibility with many...	62	Established	187	Scala
55	lttng/lttng-tools The lttng-tools project provides a session daemon (lttng-sessiond) that acts...	62	Established	343	C++
56	efficios/babeltrace Babeltrace /ˈbæbəltreɪs/ is an open-source trace manipulation toolkit.	62	Established	118	C++
57	huggingface/aisheets Build, enrich, and transform datasets using AI models with no code	61	Established	1,626	TypeScript
58	typelevel/otel4s An OpenTelemetry library for Scala based on Cats-Effect	61	Established	207	Scala
59	fastify/fastify-zipkin Fastify plugin for Zipkin distributed tracing system.	61	Established	10	JavaScript
60	dash0hq/otelbin Web-based tool to facilitate OpenTelemetry collector configuration editing...	61	Established	526	TypeScript
61	iand675/hs-opentelemetry OpenTelemetry support for the Haskell programming language	60	Established	99	Haskell
62	swift-otel/swift-otel An OpenTelemetry Protocol (OTLP) backend for Swift Log, Swift Metrics, and...	60	Established	198	Swift
63	godotengine/godot-benchmarks Collection of benchmarks to test performance of different areas of Godot	60	Established	147	GDScript
64	cilium/pwru Packet, where are you? -- eBPF-based Linux kernel networking debugger	60	Established	3,721	C
65	instana/go-sensor 🚀 Go Distributed Tracing & Metrics Sensor for Instana	60	Established	127	Go
66	signalfx/tracing-examples Examples of using third-party tracers with SignalFx	59	Established	60	C#
67	signalfx/splunk-otel-java Splunk Distribution of OpenTelemetry Java	59	Established	73	Java
68	instana/nodejs Node.js in-process collectors for Instana	59	Established	74	JavaScript
69	team-decent/decent-bench A benchmarking framework for decentralized optimization	59	Established	5	Python
70	kieker-monitoring/kieker Kieker is an observability framework that consists of a monitoring and...	59	Established	109	Java
71	jonahsnider/benchmark A Node.js benchmarking library with support for multithreading and TurboFan...	59	Established	15	TypeScript
72	dynatrace-oss/unguard Unguard is an insecure cloud-native microservices demo application.	59	Established	68	TypeScript
73	instana/python-sensor 🐍 Python Distributed Tracing & Metrics Sensor for Instana	58	Established	69	Python
74	munich-quantum-toolkit/bench MQT Bench - An MQT Tool for Benchmarking Quantum Software Tools	58	Established	113	Python
75	ertgl/tapable-tracer Trace the connections and flows between tapable hooks.	58	Established	5	TypeScript
76	uio-bmi/immuneML immuneML is a platform for machine learning analysis of adaptive immune...	58	Established	74	Python
77	ant-research/EasyTemporalPointProcess EasyTPP: Towards Open Benchmarking Temporal Point Processes	57	Established	336	Python
78	nhsengland/evalsense Tools for systematic large language model evaluations	57	Established	3	Python
79	instana/ruby-sensor 💎 Ruby Distributed Tracing & Metrics Sensor for Instana	56	Established	26	Ruby
80	atesgoral/hrm-solutions Human Resource Machine solutions and size/speed hacks	56	Established	458	Assembly
81	bamlab/flashlight 📱⚡️ Lighthouse for Mobile - audits your app and gives a performance score to...	55	Established	1,556	TypeScript
82	ldbc/ldbc_snb_docs Specification of the LDBC Social Network Benchmark suite	55	Established	60	TeX
83	aliesbelik/load-testing-toolkit Collection of open-source tools for debugging, benchmarking, load and stress...	54	Established	237	—
84	unitaryfoundation/metriq-gym metriq-gym is a framework for implementing and running standard quantum...	54	Established	29	Python
85	ryncsn/memstrack A memory allocation tracer combined with stack trace.	54	Established	175	C
86	GDATASoftwareAG/motornet Motor.NET is a microservice framework based on Microsoft.Extensions.Hosting	54	Established	43	C#
87	argonne-lcf/THAPI A tracing infrastructure for heterogeneous computing applications.	54	Established	41	C
88	DataDog/nginx-datadog Enhance NGINX Observability and Security with Datadog's Module	54	Established	41	C++
89	bencheeorg/benchee Easy and extensible benchmarking in Elixir providing you with lots of statistics!	54	Established	1,506	Elixir
90	chirpz-ai/pandaprobe 🐼 Open source agent engineering platform: traces, evals, and metrics to...	54	Established	7	Python
91	jnidzwetzki/pg-lock-tracer An eBPF based lock tracer for PostgreSQL	54	Established	165	Python
92	cau-se/theodolite Theodolite is a framework for benchmarking the horizontal and vertical...	53	Established	56	Java
93	bencherdev/bencher 🐰 Bencher - Continuous Benchmarking	53	Established	819	Rust
94	hendriknielaender/zBench 📊 zig benchmark	53	Established	199	Zig
95	DataDog/dd-trace-cpp Datadog APM client for C++	53	Established	19	C++
96	cmackenzie1/tracing-ndjson A customizable NDJSON format for tracing in Rust	53	Established	3	Rust
97	prestodb/pbench Presto/Prestissimo Benchmark Toolset	53	Established	8	Go
98	elastic/elastic-otel-dotnet Elastic OpenTelemetry .NET Distribution	53	Established	32	C#
99	signalfx/splunk-otel-dotnet Splunk Distribution of OpenTelemetry .NET	52	Established	9	C#
100	FrankChen021/bithon A full stack observability platform	52	Established	27	Java
101	beling/bsuccinct-rs Rust libraries and programs focused on succinct data structures	52	Established	159	Rust
102	DataDog/orchestrion Automatic compile-time instrumentation of Go code	52	Established	579	Go
103	FriendsOfOpenTelemetry/opentelemetry-bundle Traces, metrics, and logs instrumentation within your Symfony application	52	Established	64	PHP
104	qwerty541/dns-bench Find the fastest DNS in your location to improve internet browsing experience.	52	Established	97	Rust
105	ldcsaa/hp-soa A fully functional, easy-to-use, and highly scalable microservice framework	51	Established	95	Java
106	tlog-dev/tlog Observability events system.	51	Established	18	Go
107	ecoAPM/BenchmarkMockNet Using BenchmarkDotNet to compare .NET mocking library performance	51	Established	24	C#
108	smarr/ReBenchDB ReBenchDB records benchmark results and provides customizable reporting to...	51	Established	18	TypeScript
109	vincentfree/opentelemetry Open Telemetry extensions	51	Established	24	Go
110	Point72/raydar A perspective powered, user editable ray dashboard via ray serve	51	Established	56	Python
111	quochuydev/dokploy-grafana-compose Docker Compose stack for Grafana observability: Tempo traces, Loki logs,...	50	Established	18	—
112	ROCm/madengine madengine is a streamlined CLI tool for running and benchmarking AI models...	50	Established	6	Python
113	nfrankel/opentelemetry-tracing Demo for end-to-end tracing via OpenTelemetry	50	Established	77	Kotlin
114	CodSpeedHQ/action Github Actions for running CodSpeed in your CI	50	Established	52	Shell
115	kieker-monitoring/moobench Micro-benchmarks for quantification of the performance overhead caused by...	50	Established	6	Shell
116	ipyflow/ipyflow A reactive Python kernel for Jupyter notebooks.	50	Established	1,265	Python
117	KaykCaputo/oracletrace Lightweight Python tool to detect performance regressions and compare...	49	Emerging	15	Python
118	RRZE-HPC/MachineState This CLI tool and Python3 module collects the current system state for documentation	48	Emerging	24	Python
119	dinesh-git17/claudehome An architectural persistence experiment for large language models. Claude’s...	48	Emerging	27	TypeScript
120	ivanfioravanti/llm_context_benchmarks 📊 LLM Context Benchmarks - A comprehensive benchmarking tool for testing...	48	Emerging	50	Python
121	facebookresearch/CUTracer A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel...	48	Emerging	53	Python
122	nyrkio/nyrkio Nyrkiö is an open source platform for detecting performance changes in a...	48	Emerging	65	Python
123	oteldb/oteldb OpenTelemetry signal storage	48	Emerging	68	Go
124	tw4452852/zbpf Writing eBPF in Zig	48	Emerging	259	Zig
125	JDiskMark/jdm-java Cross-platform Java Disk Benchmark Utility for measuring drive IO performance.	48	Emerging	4	Java
126	lucsorel/pydoctrace Generate architecture diagrams by tracing Python code execution	48	Emerging	17	Python
127	komoju/komoju-datadog Rust Datadog instrumentation	48	Emerging	4	Rust
128	mesaglio/otel-front Lightweight OpenTelemetry viewer for local development. Single binary, no...	47	Emerging	39	TypeScript
129	Helmholtz-AI-Energy/perun Perun is a Python package that measures the energy consumption of your applications.	47	Emerging	91	Python
130	containerscrew/nflux Simple network monitoring agent tool. Powered by eBPF & Rust 🐝	47	Emerging	9	Rust
131	blooop/bencher A package for benchmarking the characteristics of arbitrary functions	46	Emerging	4	Python
132	GabrielTecuceanu/httpress a fast HTTP benchmarking tool built in Rust	46	Emerging	10	Rust
133	DataDog/httpd-datadog Enhance Apache HTTPD Observability with Datadog's Module	46	Emerging	4	Python
134	proactive-agent/langgraphics Visualize live LangGraph execution and see how your agent thinks as it runs.	45	Emerging	88	TypeScript
135	CodSpeedHQ/instrument-hooks Internal core for the codspeed instruments	45	Emerging	2	C
136	kjldev/purview-telemetry-sourcegenerator .NET Source Generator for interface-based telemetry. Supporting activities,...	45	Emerging	30	C#
137	agurinov/gopl Golang platform library	45	Emerging	5	Go
138	grafana/otel-profiling-go Open Telemetry integration for Grafana Pyroscope and tracing solutions such...	45	Emerging	101	Go
139	Spectral-Knight-Ops/local-llm-evaluator Quickly test local LLMs with custom prompts to determine which model is best for you.	45	Emerging	8	Python
140	feelpp/benchmarking Feel++ Benchmarking	45	Emerging	3	Python
141	gstinoco/mGFD Meshless Generalized Finite Differences (mGFD) solver and reference...	44	Emerging	4	Python
142	shnarazk/SAT-bench A benchmark suite for SAT solvers	44	Emerging	2	Rust
143	uptrace/uptrace-ruby OpenTelemetry Ruby distribution for Uptrace	44	Emerging	3	Ruby
144	coralogix/coralogix-management-sdk API clients for configuring the Coralogix platform.	44	Emerging	4	Go
145	omniviser/omniray Stop guessing! You and your AI can now see live what's happening inside your...	43	Emerging	4	Python
146	HPE/torch-hammer Torch Hammer: Strike while the GPU is hot	43	Emerging	9	Python
147	typelevel/otel4s-sdk Implementation of the otel4s SDK modules in Scala from scratch	43	Emerging	5	Scala
148	falcondev-oss/workflow Simple type-safe queue worker with durable execution based on BullMQ.	42	Emerging	2	TypeScript
149	beorn/loggily TypeScript logger with debug-style namespaces, structured JSON, and...	42	Emerging	2	TypeScript
150	givecareapp/givecare-bench AI safety benchmark for long-term caregiving relationships. Tests crisis...	42	Emerging	2	Python
151	NyanKiyoshi/pytest-django-queries Generate performance reports from your django database performance tests.	42	Emerging	83	Python
152	pgx-contrib/pgxotel OpenTelemetry tracing instrumentation for pgx v5 — spans for queries,...	41	Emerging	8	Go
153	skerkour/go-benchmarks Comprehensive and reproducible benchmarks for Go developers and architects.	41	Emerging	13	Go
154	rsasaki0109/CloudAnalyzer CLI-first QA toolkit for point clouds, trajectories, and 3D perception...	41	Emerging	10	Python
155	MrAlias/flow An OpenTelemetry SpanProcessor reporting tracing flow metrics	41	Emerging	10	Go
156	udhos/opentelemetry-trace-sqs opentelemetry-trace-sqs propagates Open Telemetry tracing with SQS messages...	41	Emerging	8	Go
157	jamesgober/metrics-lib The fastest metrics library for Rust. Lock-free 0.6ns gauges, 18ns counters,...	41	Emerging	7	Rust
158	smyrgeorge/log4k A Comprehensive Logging and Tracing Solution for Kotlin Multiplatform.	40	Emerging	62	Kotlin
159	KempnerInstitute/nvidia-hpc-benchmarks NVIDIA HPC Benchmarks	40	Emerging	10	Shell
160	meshkovQA/Eval-ai-library Comprehensive AI Model Evaluation Framework with advanced techniques...	39	Emerging	31	Python
161	getaxonflow/axonflow AxonFlow: Runtime control layer for production AI	39	Emerging	43	Go
162	IBM/OpenDsStar OpenDsStar is an open-source implementation of the DS-Star agent that...	38	Emerging	15	Python
163	kobsio/kobs Kubernetes Observability Platform	37	Emerging	216	TypeScript
164	hdmsantander/microservices-ops-demo Spring Boot demo for observability, traceability and error analysis in a...	37	Emerging	4	Java
165	mbzuai-oryx/Agent-X ICLR 2026: Agent-X Evaluating Deep Multimodal Reasoning in Vision-Centric...	37	Emerging	39	Jupyter Notebook
166	evaluation-context-protocol/ecp ECP is a standardized interface for orchestrating, auditing, and enforcing...	37	Emerging	7	Python
167	verifywise-ai/plugin-marketplace VerifyWise AI Governance Plugin Marketplace	36	Emerging	3	TypeScript
168	braintrustdata/braintrust-pi-extension Braintrust tracing plugin for pi	36	Emerging	2	TypeScript
169	nixel2007/opentelemetry OpenTelemetry SDK for OneScript	36	Emerging	8	1C Enterprise
170	everythings-gonna-be-alright/phpScope PHP profiler that sends CPU sampling data to Pyroscope server.	36	Emerging	17	Go
171	opsrobot-ai/opsrobot Observability platform for OpenClaw agents, providing real-time tracing,...	35	Emerging	76	JavaScript
172	kolloch/reqray Log call tree summaries after each request for rust programs instrumented...	35	Emerging	45	Rust
173	tracewayapp/opentelemetry-symfony-bundle Pure-PHP OpenTelemetry instrumentation for Symfony - automatic HTTP,...	35	Emerging	57	PHP
174	PacificBiosciences/aardvark A tool for sniffing out the differences in vari-Ants	35	Emerging	40	Rust
175	yonatan-h/express-k6-profiler Finds bottlenecks in an Express app during load testing	34	Emerging	14	TypeScript
176	cuihairu/croupier Croupier is a universal GM (Game Master) backend system designed for game...	34	Emerging	13	Go
177	aykhans/sarin A high-performance HTTP load testing tool. Features dynamic request...	33	Emerging	7	Go
178	dolmen-go/flagx Extensions for the Go 'flag' package: flagx, flagfile, flagnet, flagtrace	32	Emerging	3	Go
179	MrAlias/collex Use OpenTelemetry Collector Factories to Export with OpenTelemetry Go	32	Emerging	3	Go
180	rodneylab/axum-graphql Rust GraphQL demo/test API written in Rust, using Axum for routing,...	31	Emerging	2	Rust
181	AmalChandru/termtrace A terminal workflow recorder that turns debugging sessions into replayable,...	31	Emerging	26	Go
182	last9/opentelemetry-examples Production-ready OpenTelemetry instrumentation examples for Go, Python,...	31	Emerging	3	Python
183	PAIR-Systems-Inc/little-dorrit-editor Multimodal benchmark for evaluating handwritten editorial correction in printed text.	31	Emerging	2	Python
184	filipsPL/optuml Optuna-optimized ML methods, with scikit-learn like API	31	Emerging	2	Python
185	BudEcosystem/bud-runtime Bud AI Foundry - A comprehensive inference stack for compound AI deployment,...	31	Emerging	2	Python
186	russfellows/sai3-bench A multi-protocol storage performance testing tool, inspired by vdbench, fio...	30	Emerging	2	Rust
187	hboublal/dopGuard Modular observability platform for .NET applications, integrating with tools...	30	Emerging	2	C#
188	imadAttar/spring-boot-unified-observability-starter All-in-one Spring Boot Starter for Observability: Metrics, Traces, Logs, and...	30	Emerging	6	Java
189	nshkrdotcom/AITrace The unified observability layer for the AI Control Plane	30	Emerging	2	Elixir
190	qcmet/qcmet Quantum Computing Metrics and Benchmarks	30	Emerging	5	Jupyter Notebook
191	tolitius/cupel discover LLMs punching above their weight	29	Experimental	28	JavaScript
192	wangyz1999/sync-video-label A web-based annotation tool for synchronized multi-video timeline labeling...	29	Experimental	17	TypeScript
193	iRevive/fs2-grpc-otel4s otel4s instrumentation for fs2-grpc	28	Experimental	2	Scala
194	mnemom/mnemom-platform Safe House for AI agents — transparent gateway with inbound + outbound...	28	Experimental	6	TypeScript
195	rvnhq/raven A lightweight, self-hostable cloud infrastructure monitoring and telemetry platform.	28	Experimental	5	Rust
196	DaSH-Lab-CSIS/blossom BLOSSOM: Block-wise Federated Learning Over Shared and Sparse Observed...	27	Experimental	3	Python
197	kyahikaru/llm-guardrail-red-teaming Protocol constrained red teaming of frontier LLM guardrails in high risk...	27	Experimental	1	—
198	last9/rails-otel-context Tells you which code fired that query. Zero config.	27	Experimental	3	Ruby
199	thanhdaon/clean-arch-go Clean Architecture, DDD, CQRS with testings in Go	27	Experimental	19	Go
200	LLMSystems/BehaviorRL-Hallucination Learning When to Answer: Behavior-Oriented Reinforcement Learning for...	26	Experimental	7	Python
201	maxi4youuu/RePRo 🧠 Enhance raw prompts into optimized, powerful versions for AI tools like...	26	Experimental	2	TypeScript
202	Anarv2104/Inflion Observability and influence tracing infrastructure for multi-agent AI systems.	26	Experimental	2	Python
203	HiThink-Research/FinMTM [ACL 2026] FinMTM: A Multi-Turn Multimodal Benchmark for Financial Reasoning...	25	Experimental	25	Python
204	fourdollars/cella A terminal UI and CLI for managing and monitoring LXD + Docker containers —...	25	Experimental	3	Go
205	FelixBroesamle/s2mflow Meta-generator: generating multicommodity flow instances from...	24	Experimental	2	Rust
206	iazaran/trace-replay High-fidelity process tracking, deterministic replay, and AI-powered...	24	Experimental	2	PHP
207	Basaltlabs-app/Gauntlet Community-driven behavioral reliability benchmark for LLMs. 88 probes across...	24	Experimental	2	Python
208	SagarMaheshwary/reqlog Fast CLI to search and trace logs across services or single files using...	24	Experimental	2	Go
209	TomasVenkrbec/lazyline Zero-config line-level Python profiler. No decorators, no code changes....	24	Experimental	2	Python
210	0xMilord/better-logger Execution flow debugger for modern apps. Turn scattered `console.log` calls...	24	Experimental	2	TypeScript
211	vikpant/strategic-coopetition Coopetition-Gym: A research-grade mixed-motive multi-agent reinforcement...	23	Experimental	2	Python
212	bajajku/VAC Develop and evaluate a trauma-informed LLM-based chatbot that is...	22	Experimental	2	Python
213	parsamivehchi/tps.sh tps.sh — Tokens Per Second LLM Benchmark. 7 models, 147 tests, 21 prompts...	18	Experimental	2	Python
214	Zxela/claude-monitor Real-time dashboard for monitoring Claude Code sessions — live token usage,...	16	Experimental	2	Go
215	pilhuhn/otel-oql An experiment in creating an OpenTelemetry backend	16	Experimental	2	Go
216	MarkIvor/officeiq Research question: can the "office intelligence" of an LLM be measured? An attempt...	15	Experimental	2	HTML
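To work with the listing offline, each row can be split into typed fields. The sketch below assumes the columns are tab-separated in the order shown in the header (#, Tool, Score, Tier, Stars, Language) and that the Tool cell is "owner/repo" followed by a space and the description. `ToolRow` and `parse_row` are hypothetical names.

```python
from typing import NamedTuple


class ToolRow(NamedTuple):
    rank: int
    repo: str
    description: str
    score: int
    tier: str
    stars: int
    language: str


def parse_row(line: str) -> ToolRow:
    """Parse one tab-separated table row into a ToolRow."""
    rank, tool, score, tier, stars, language = line.rstrip("\n").split("\t")
    # The Tool cell holds "owner/repo", a space, then the description.
    repo, _, description = tool.partition(" ")
    # Star counts use thousands separators, e.g. "14,681".
    return ToolRow(int(rank), repo, description, int(score), tier,
                   int(stars.replace(",", "")), language)
```

For example, `parse_row` applied to row 20 gives a record whose `repo` is `confident-ai/deepeval` and whose `stars` is the integer 14681.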