IAAR-Shanghai/CRUD_RAG

CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models

Quality score: 34 / 100 (Emerging)

This project helps developers and researchers evaluate the performance of Retrieval-Augmented Generation (RAG) systems on Chinese-language data. You provide your RAG system and a large set of Chinese news documents, and the project outputs metrics showing how well the system retrieves relevant information and generates accurate, coherent responses. It is designed for AI/ML engineers, natural language processing researchers, and RAG system developers.

362 stars. No commits in the last 6 months.

Use this if you are building or researching RAG systems and need a robust benchmark to assess their capabilities, especially with Chinese text.

Not ideal if you are an end-user looking for a ready-to-use RAG application or if you are not comfortable with modifying code and setting up language model APIs.

Tags: RAG evaluation, Chinese NLP, LLM benchmarking, Information Retrieval, Generative AI development
No License · Stale (6 months) · No Package · No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 14 / 25
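The four category scores above appear to add up to the overall score (2 + 10 + 8 + 14 = 34); the page does not document its formula, so treat this as an assumption. A minimal sketch of that scoring model:

```python
# Assumed scoring model (not documented by the page): the overall score is
# the sum of the four category scores, each out of 25.
category_scores = {"Maintenance": 2, "Adoption": 10, "Maturity": 8, "Community": 14}
overall = sum(category_scores.values())
print(f"{overall} / 100")  # → 34 / 100
```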


Stars: 362
Forks: 28
Language: Python
License: none
Last pushed: May 20, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/IAAR-Shanghai/CRUD_RAG"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
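The curl command above can also be driven from Python with the standard library. This is a minimal sketch: the endpoint URL is taken from the page, but the shape of the JSON response and the field names (`score`, `stars`) are assumptions, so adjust them to the actual payload.

```python
import json
import urllib.request

# Endpoint copied from the page above; the response is assumed to be JSON.
API_URL = "https://pt-edge.onrender.com/api/v1/quality/rag/IAAR-Shanghai/CRUD_RAG"

def fetch_quality_report(url: str = API_URL) -> dict:
    """Fetch and decode the quality report for a repository."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def summarize(report: dict) -> str:
    """One-line summary; the field names here are assumptions, not a documented schema."""
    score = report.get("score", "?")
    stars = report.get("stars", "?")
    return f"score={score}/100, stars={stars}"
```

Usage would be `print(summarize(fetch_quality_report()))`; remember the keyless tier allows 100 requests per day.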