truera/trulens
Evaluation and Tracking for LLM Experiments and AI Agents
This tool helps AI engineers and developers systematically evaluate and track experiments on their Large Language Model (LLM) applications. Given your application's prompts, models, retrievers, and knowledge sources, it produces detailed feedback and performance insights that surface failure modes, so you can understand and improve your application's behavior and performance.
3,160 stars. Actively maintained with 9 commits in the last 30 days. Available on PyPI.
Use this if you are building or iterating on an LLM-powered application and need a structured way to test, compare, and improve different versions of your app.
Not ideal if you are looking for a simple API wrapper for LLMs or a general-purpose data logging tool without specific LLM evaluation needs.
Stars: 3,160
Forks: 251
Language: Python
License: MIT
Category:
Last pushed: Mar 10, 2026
Commits (30d): 9
Dependencies: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/truera/trulens"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
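The curl command above can also be issued from Python. A minimal sketch, assuming only the endpoint URL shown here; the response field names (`stars`, `commits_30d`) are guesses for illustration and should be checked against a live response, not treated as a documented schema:

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"

def agent_url(owner: str, repo: str) -> str:
    """Build the quality-API URL for a GitHub owner/repo pair."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_agent(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for one agent.

    This is a live network call, counted against the
    100 requests/day unauthenticated limit.
    """
    with urlopen(agent_url(owner, repo)) as resp:
        return json.loads(resp.read())

def summarize(payload: dict) -> str:
    """One-line summary; the field names used here are assumptions."""
    stars = payload.get("stars", "?")
    commits = payload.get("commits_30d", "?")
    return f"{stars} stars, {commits} commits in the last 30 days"

# Offline demo with a sample payload shaped like the listing above.
sample = {"stars": 3160, "commits_30d": 9}
print(summarize(sample))
```

To query the live endpoint, replace the offline demo with `print(summarize(fetch_agent("truera", "trulens")))`.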
Related agents
traceroot-ai/traceroot
Find the Root Cause in Your Code's Trace
future-agi/traceAI
Open Source AI Tracing Framework built on Opentelemetry for AI Applications and Frameworks
evilmartians/agent-prism
React components for visualizing traces from AI agents
VishApp/multiagent-debugger
Multi-Agent Debugger: An AI-powered debugging system using CrewAI to orchestrate specialized...
InftyAI/alphatrion
⚒️ AlphaTrion is an open-source observability platform for AI agents, including experiment...