CLUEbenchmark/CLUE
Chinese Language Understanding Evaluation Benchmark (中文语言理解测评基准): datasets, baselines, pre-trained models, corpus and leaderboard
This project offers a comprehensive benchmark for evaluating and comparing the performance of Chinese language understanding models. It provides standardized datasets for various tasks, pre-trained models, and a public leaderboard. This helps researchers and AI developers assess the capabilities of different models on real-world Chinese text understanding challenges.
4,237 stars.
Use this if you are developing or researching AI models for Chinese language processing and need a standardized way to evaluate their performance on tasks like text classification, natural language inference, and reading comprehension.
Not ideal if you are looking for an off-the-shelf application to directly solve a business problem or if your primary focus is on languages other than Chinese.
Stars: 4,237
Forks: 546
Language: Python
License: —
Category:
Last pushed: Feb 06, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/CLUEbenchmark/CLUE"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
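The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client. It assumes that the endpoint returns JSON and that the path follows the pattern of the curl example (a `transformers` segment followed by owner and repository). Neither is confirmed by this page, and the names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is appended per request.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "transformers") -> str:
    """Build a URL shaped like the curl example (hypothetical helper).

    Assumption: other repositories use the same <category>/<owner>/<repo> layout.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return "/".join([BASE_URL, *parts])


def fetch_quality(owner: str, repo: str, category: str = "transformers",
                  timeout: float = 10.0):
    """Fetch the record for one repository without an API key.

    Assumption: the response body is JSON; the schema is not documented here,
    so the parsed object is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality("CLUEbenchmark", "CLUE") to hit the network.
    print(build_quality_url("CLUEbenchmark", "CLUE"))
```

Because keyless access is limited to 100 requests/day, it is worth caching the result of `fetch_quality` rather than calling it repeatedly.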
Higher-rated alternatives
shibing624/MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline....
lyogavin/airllm
AirLLM 70B inference with single 4GB GPU
GradientHQ/parallax
Parallax is a distributed model serving framework that lets you build your own AI cluster anywhere
CrazyBoyM/llama3-Chinese-chat
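Repository of Chinese post-trained versions of Llama3 and Llama3.1: fine-tuned and modified weights of interest, plus tutorial videos and documentation for training, inference, evaluation, and deployment.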
MediaBrain-SJTU/MING
MING (明医): a Chinese large language model for medical consultation