LaVi-Lab/CLEVA

[EMNLP 2023 Demo] "CLEVA: Chinese Language Models EVAluation Platform"

Score: 32 / 100 (Emerging)

This platform helps you accurately assess the performance of Chinese Large Language Models (LLMs). Given a Chinese LLM you want to test, it produces detailed evaluation results across 31 tasks (such as summarization, translation, and fact-checking), along with a trustworthy leaderboard. It is ideal for researchers, developers, or businesses working with Chinese natural language processing who need to benchmark and compare models.

No commits in the last 6 months.

Use this if you need a comprehensive and standardized way to evaluate Chinese LLMs, minimizing issues like data contamination.

Not ideal if you need to evaluate non-Chinese language models, or if you want a solution that does not rely on the HELM framework for local evaluation.

Chinese-NLP LLM-evaluation model-benchmarking natural-language-processing AI-research
Badges: Stale (6m) · No Package · No Dependents

Maintenance: 2 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 6 / 25


Stars: 64
Forks: 3
Language: Shell
License: (none listed)
Last pushed: May 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/LaVi-Lab/CLEVA"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
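If you prefer to consume the endpoint from a script rather than curl, a minimal Python sketch is below. It only uses the standard library; the `quality_url` helper and the assumption that the endpoint returns a JSON body are illustrative, not documented behavior of the API.

```python
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository (helper name is hypothetical)."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the quality report; assumes a JSON response body.

    Anonymous access is limited to 100 requests/day, so cache results
    rather than calling this in a loop.
    """
    with urllib.request.urlopen(quality_url(category, owner, repo), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Build the same URL shown in the curl example above:
url = quality_url("nlp", "LaVi-Lab", "CLEVA")
print(url)  # https://pt-edge.onrender.com/api/v1/quality/nlp/LaVi-Lab/CLEVA
```

Call `fetch_quality("nlp", "LaVi-Lab", "CLEVA")` to retrieve the live data; the shape of the returned JSON is not specified on this page, so inspect it before relying on particular fields.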