CLUEbenchmark/SuperCLUE

SuperCLUE: A Comprehensive Benchmark for Chinese General-Purpose Large Models | A Benchmark for Foundation Models in Chinese

Score: 44 / 100 (Emerging)

Need to understand how well large language models (LLMs) perform on tasks relevant to the Chinese language and culture? SuperCLUE provides comprehensive evaluations across key capabilities like language understanding, professional knowledge, and AI agent performance. It helps compare different Chinese LLMs and understand their strengths and weaknesses, offering a ranked list of models and detailed breakdowns of their abilities.


Use this if you need to compare or select Chinese large language models based on their performance across a wide range of practical applications and specialized skills.

Not ideal if you need evaluations of non-Chinese language models, or highly specialized, niche technical benchmarks unrelated to general LLM capabilities.

AI-model-evaluation Chinese-language-AI LLM-benchmarking natural-language-processing AI-agent-performance
No License | No Package | No Dependents
Maintenance 10 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 16 / 25


Stars: 3,277
Forks: 112
Language:
License: No License
Last pushed: Feb 06, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CLUEbenchmark/SuperCLUE"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
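The same endpoint can be called from Python instead of curl. A minimal sketch using only the standard library, assuming the URL pattern `.../quality/llm-tools/{owner}/{repo}` shown above; the shape of the JSON response is not documented here, so the fetch helper simply returns the parsed body as-is:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for an owner/repo pair (assumed pattern)."""
    return f"{BASE}/{quote(owner)}/{quote(repo)}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record as parsed JSON; response schema is an assumption."""
    with urlopen(quality_url(owner, repo), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Prints the same URL the curl example targets.
    print(quality_url("CLUEbenchmark", "SuperCLUE"))
```

Within the free 100 requests/day tier this needs no authentication; a key, if obtained, would presumably be passed as a header or query parameter per the service's own docs.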