LLM Evaluation Platforms AI Agents

Three LLM evaluation platform agents are tracked. One scores above 50 (Established tier). The highest-rated is katanemo/plano at 64/100 with 5,953 stars. One of the top 10 is actively maintained.

Get all 3 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=llm-evaluation-platforms&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
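The same request can be made from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, chosen for illustration. The response schema is not described on this page, so the sketch returns the parsed JSON unchanged rather than assuming any field names.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="agents", subcategory="llm-evaluation-platforms", limit=20):
    """Build the query URL with the same parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Send a keyless GET request and return the decoded JSON body.

    The structure of the response is an unknown here, so nothing is
    unpacked or validated; the caller inspects the result.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example use (performs a network request, counted against the daily limit):
#   data = fetch_projects(build_url())
#   print(json.dumps(data, indent=2))
```

Printing the result with `json.dumps(data, indent=2)` is a reasonable first step for seeing which fields the API actually returns before writing code that depends on them.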

#	Agent	Score	Tier	Stars	Language
1	katanemo/plano Plano is an AI-native proxy and data plane for agentic apps — with built-in...	64	Established	5,953	Rust
2	abhiai-git/agent_trajectory_evaluation agent_trajectory_evaluation is a Python package designed to evaluate the...	17	Experimental	—	Python
3	Tarunjit45/local-ai-safety-auditor An implementation of Asynchronous AI Oversight using local Small Language...	13	Experimental	—	Python