TonicAI/tvallogging
A tool for evaluating and tracking your RAG experiments. This repo contains the Python SDK for logging to Tonic Validate.
This tool helps AI engineers and developers who build Retrieval Augmented Generation (RAG) applications track and improve their applications' performance. You supply your RAG application's responses to a benchmark dataset, along with the context it retrieved. The tool then scores these outputs using RAG metrics and visualizes the results, making it easier to compare different versions of your application.
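The workflow described above (benchmark questions in, answers and retrieved context recorded, scores compared across versions) can be illustrated with a small library-agnostic sketch. It does not use the tvallogging SDK: `RagRecord`, `toy_overlap_score`, and `mean_score` are hypothetical names, and the word-overlap score is a stand-in chosen for illustration, not one of Tonic Validate's RAG metrics.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class RagRecord:
    """One benchmark item plus what the RAG application produced for it (hypothetical shape)."""
    question: str
    reference_answer: str
    llm_answer: str
    retrieved_context: List[str]  # recorded for inspection; the toy score below ignores it


def toy_overlap_score(record: RagRecord) -> float:
    """Fraction of reference-answer words that also appear in the LLM answer.

    This is a placeholder metric, not a Tonic Validate metric.
    """
    ref = set(record.reference_answer.lower().split())
    ans = set(record.llm_answer.lower().split())
    return len(ref & ans) / len(ref) if ref else 0.0


def mean_score(records: List[RagRecord]) -> float:
    """Average the toy score over one run so that two application versions can be compared."""
    return mean(toy_overlap_score(r) for r in records)
```

Running `mean_score` on the outputs of two versions of an application over the same benchmark gives one number per version, which is the kind of side-by-side comparison the tool is described as visualizing.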
No commits in the last 6 months.
Use this if you are developing RAG applications and need a systematic way to evaluate, track, and compare the performance of your models over time, ensuring they deliver accurate and relevant information.
Not ideal if you are looking for a general-purpose machine learning experiment tracker not specifically focused on RAG, or if you prefer to calculate RAG metrics entirely offline without a dedicated platform.
Stars: 8
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Dec 08, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/TonicAI/tvallogging"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
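The same request can be made from Python using only the standard library. This is a minimal sketch: the URL pattern is taken from the curl command above, while the helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; owner and repo are appended to it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository, escaping each path segment."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data for a repository.

    Assumes the response body is JSON; json.load raises ValueError otherwise.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.load(resp)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("TonicAI", "tvallogging")
```

Because unauthenticated access is limited to 100 requests/day, it is worth caching the result locally rather than calling `fetch_quality` in a loop.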
Higher-rated alternatives
vectara/open-rag-eval
RAG evaluation without the need for "golden answers"
DocAILab/XRAG
XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced...
HZYAI/RagScore
⚡️ The "1-Minute RAG Audit" — Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or...
AIAnytime/rag-evaluator
A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).
microsoft/benchmark-qed
Automated benchmarking of Retrieval-Augmented Generation (RAG) systems