TonicAI/tonic_validate
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
This tool helps evaluate the quality of responses from your AI applications that generate text based on retrieved information, like chatbots or intelligent assistants. You provide questions, the answers your AI gives, and the sources it used, and the tool outputs scores and insights into how accurate and truthful your AI's responses are. This is for anyone who builds or manages AI-powered knowledge systems and wants to ensure their AI provides reliable information.
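In practice, an evaluation run looks roughly like the sketch below. This is a minimal sketch based on the Benchmark/ValidateScorer usage pattern in the project's README; get_rag_response is a hypothetical stand-in for your own RAG pipeline, exact class names and the callback's return shape may differ across versions, and scoring uses an LLM judge, so an OPENAI_API_KEY is expected in the environment by default.

# pip install tonic-validate
from tonic_validate import Benchmark, ValidateScorer

# Questions paired with reference answers your RAG app should get right.
benchmark = Benchmark(
    questions=["What is the capital of France?"],
    answers=["Paris"],
)

def get_rag_response(question: str) -> dict:
    # Hypothetical stand-in for your real pipeline: return the generated
    # answer plus the retrieved context passages it was based on.
    return {
        "llm_answer": "Paris is the capital of France.",
        "llm_context_list": ["France's capital city is Paris."],
    }

scorer = ValidateScorer()
run = scorer.score(benchmark, get_rag_response)
for item in run.run_data:
    print(item.scores)  # per-question metric scores, e.g. answer similarity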
324 stars. No commits in the last 6 months.
Use this if you need to systematically check whether your AI assistant or RAG application returns accurate, relevant, non-hallucinated answers grounded in the information it is given.
Not ideal if you want to evaluate AI models that generate creative content or images, or if you don't have clear source documents behind your AI's responses.
Stars: 324
Forks: 31
Language: Python
License: MIT
Last pushed: Jul 10, 2025
Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/TonicAI/tonic_validate"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
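A minimal sketch of consuming this endpoint from Python using only the standard library; the response schema isn't documented here, so it simply pretty-prints whatever JSON comes back.

import json
import urllib.request

url = "https://pt-edge.onrender.com/api/v1/quality/rag/TonicAI/tonic_validate"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

# The exact response fields are not documented on this page, so just
# pretty-print the full payload to see what the API returns.
print(json.dumps(data, indent=2))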
Higher-rated alternatives
vectara/open-rag-eval
RAG evaluation without the need for "golden answers"
DocAILab/XRAG
XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced...
HZYAI/RagScore
⚡️ The "1-Minute RAG Audit" — Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or...
AIAnytime/rag-evaluator
A library for evaluating Retrieval-Augmented Generation (RAG) systems (The traditional ways).
microsoft/benchmark-qed
Automated benchmarking of Retrieval-Augmented Generation (RAG) systems