rageval and RAG-evaluation-harnesses

These repositories are competitors: both provide an evaluation suite designed specifically for Retrieval-Augmented Generation (RAG) methods, so they are alternative choices for the same purpose.

rageval
Overall score: 36 (Emerging)
Maintenance: 0/25 | Adoption: 10/25 | Maturity: 16/25 | Community: 10/25
Stars: 170 | Forks: 10 | Downloads: n/a | Commits (30d): 0
Language: Python | License: Apache-2.0
Flags: Stale 6m, No Package, No Dependents

RAG-evaluation-harnesses
Overall score: 35
Maintenance: 2/25 | Adoption: 6/25 | Maturity: 16/25 | Community: 11/25
Stars: 23 | Forks: 3 | Downloads: n/a | Commits (30d): 0
Language: Python | License: MIT
Flags: Stale 6m, No Package, No Dependents

About rageval

gomate-community/rageval

Evaluation tools for Retrieval-augmented Generation (RAG) methods.

This tool evaluates the performance of your Retrieval-Augmented Generation (RAG) system. It takes the outputs from the stages of your RAG pipeline (rewritten queries, retrieved documents, and generated answers) and scores how well the system performs on aspects such as answer correctness, factual consistency, and document relevance. It is designed for AI/ML engineers and researchers building and refining RAG-based applications.
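To make the kind of scoring described above concrete, here is a minimal plain-Python sketch of two such per-stage metrics: token-level F1 for answer correctness and a simple context-relevance ratio for retrieved documents. The function names and the example record are illustrative placeholders only, not rageval's actual metric classes or API; consult the repository README for the real interface.

```python
# Illustrative sketch only: plain-Python stand-ins for the kinds of scores a RAG
# evaluator computes (answer correctness, context relevance). Not rageval's API.

def _tokens(text: str) -> list[str]:
    return text.lower().split()

def answer_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a generated answer and a gold answer."""
    pred, gold = _tokens(prediction), _tokens(reference)
    common = len(set(pred) & set(gold))
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(gold)
    return 2 * precision * recall / (precision + recall)

def context_relevance(retrieved_docs: list[str], reference: str) -> float:
    """Fraction of retrieved documents that contain at least one gold-answer token."""
    gold = set(_tokens(reference))
    hits = sum(1 for doc in retrieved_docs if gold & set(_tokens(doc)))
    return hits / len(retrieved_docs) if retrieved_docs else 0.0

# One pipeline record: question, retrieved documents, generated answer, gold answer.
record = {
    "question": "Who wrote On the Origin of Species?",
    "contexts": ["On the Origin of Species was written by Charles Darwin in 1859."],
    "answer": "Charles Darwin wrote it.",
    "ground_truth": "Charles Darwin",
}

print("answer_f1:", answer_f1(record["answer"], record["ground_truth"]))
print("context_relevance:", context_relevance(record["contexts"], record["ground_truth"]))
```

A real evaluator of this kind typically batches many such records and reports aggregate scores per pipeline stage, which is what the library automates.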

AI-evaluation NLP-benchmarking Generative-AI-testing LLM-performance Information-retrieval-quality

About RAG-evaluation-harnesses

RulinShao/RAG-evaluation-harnesses

An evaluation suite for Retrieval-Augmented Generation (RAG).

This project helps evaluate how well your Retrieval-Augmented Generation (RAG) system performs on various question-answering tasks. You provide your RAG model's retrieved documents and the questions, and it outputs performance scores. This tool is for researchers, developers, or MLOps engineers who are building and fine-tuning RAG systems and need to rigorously benchmark their effectiveness.
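As a rough illustration of what such a harness automates, the sketch below runs a stand-in RAG system over a few question-answering examples and reports exact-match accuracy. The `rag_answer` callable, the normalization helper, and the toy examples are hypothetical placeholders, not this project's actual command-line or Python interface.

```python
# Illustrative sketch only: a minimal exact-match QA evaluation loop of the kind an
# evaluation harness automates. `rag_answer` stands in for your own RAG system.

import re
import string

def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (common EM normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    return any(normalize(prediction) == normalize(g) for g in gold_answers)

def evaluate(rag_answer, examples: list[dict]) -> float:
    """Query the RAG system with each (question, retrieved docs) pair and report EM accuracy."""
    correct = 0
    for ex in examples:
        prediction = rag_answer(ex["question"], ex["retrieved_docs"])
        correct += exact_match(prediction, ex["answers"])
    return correct / len(examples)

# Toy run with a stubbed RAG system that simply echoes the first retrieved passage.
examples = [
    {"question": "Capital of France?", "retrieved_docs": ["Paris"], "answers": ["Paris"]},
    {"question": "2 + 2?", "retrieved_docs": ["The answer is 4."], "answers": ["4"]},
]
print("EM accuracy:", evaluate(lambda question, docs: docs[0], examples))
```

A full harness adds standard QA datasets, prompt construction, and additional metrics on top of a loop like this, so results are comparable across RAG systems.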

RAG-evaluation LLM-benchmarking NLP-research AI-model-testing information-retrieval

Scores updated daily from GitHub, PyPI, and npm data.