AIAnytime/rag-evaluator
A library for evaluating Retrieval-Augmented Generation (RAG) systems (the traditional way).
This tool helps you check the quality of answers generated by AI systems, especially those that combine information retrieval with text generation (RAG systems). You provide the original question, the AI's answer, and a reference answer known to be good, and it scores the AI's answer against that reference. This is ideal for AI developers, researchers, and anyone building or testing conversational AI applications.
No commits in the last 6 months. Available on PyPI.
Use this if you are developing or managing AI systems that generate text and need to quantitatively assess the accuracy, coherence, and fairness of their outputs against known good answers.
Not ideal if you're looking for a tool to generate text, fix grammar, or analyze human-written content for sentiment, as it specifically evaluates AI-generated responses.
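To make the idea of scoring an answer against a reference concrete, here is a minimal sketch of one traditional overlap metric, token-level F1. This is an illustration only and is not taken from rag-evaluator: the function names `tokenize` and `token_f1` are hypothetical, and the library's own metrics and API may differ.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer.

    Hypothetical helper for illustration; not part of rag-evaluator.
    """
    cand_tokens = tokenize(candidate)
    ref_tokens = tokenize(reference)
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Multiset intersection: each shared token counts at most as often
    # as it appears in both texts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)  # share of the answer that is supported
    recall = overlap / len(ref_tokens)      # share of the reference that is covered
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "The capital of France is Paris"
    print(token_f1("Paris is the capital of France", reference))
    print(token_f1("Paris", reference))
```

An overlap score like this rewards shared wording rather than correctness, so it can rate a fluent but wrong answer highly. That is one reason to read such scores alongside other metrics.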
Stars: 42
Forks: 18
Language: Python
License: MIT
Category:
Last pushed: Aug 10, 2024
Commits (30d): 0
Dependencies: 7
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/AIAnytime/rag-evaluator"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
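The same request can be made from Python. The sketch below uses only the standard library and the URL pattern shown in the curl command. It assumes the endpoint returns a JSON body, which the page does not state, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the owner/repo suffix is appended.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality-data URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. If it is not, json.loads
    raises a ValueError and the raw body should be inspected instead.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality("AIAnytime", "rag-evaluator")
    # to perform the request (it counts against the daily request limit).
    print(quality_url("AIAnytime", "rag-evaluator"))
```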
Related tools
vectara/open-rag-eval
RAG evaluation without the need for "golden answers"
DocAILab/XRAG
XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced...
HZYAI/RagScore
⚡️ The "1-Minute RAG Audit" — Generate QA datasets & evaluate RAG systems in Colab, Jupyter, or...
microsoft/benchmark-qed
Automated benchmarking of Retrieval-Augmented Generation (RAG) systems
2501Pr0ject/RAGnarok-AI
Local-first RAG evaluation framework for LLM applications. 100% local, no API keys required.