jeinlee1991/chinese-llm-benchmark

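ReLE Evaluation: capability evaluation of Chinese AI large models (continuously updated). It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.5, ernie4.5, MiniMax-M2.5, deepseek-v3.2, Qwen3.5, llama4, Zhipu GLM-5, GLM-4.7, LongCat, gemma3, and mistral. Beyond leaderboards, it also provides a large-model defect library of more than 2 million entries, so the community can study, analyze, and improve large models.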

52 / 100

Established

This project provides comprehensive evaluations of Chinese large language models (LLMs), covering both commercial and open-source options across diverse fields like education, finance, healthcare, and law. You input a list of Chinese LLMs you are considering, and it outputs detailed performance rankings and a database of known defects for each model. This is for anyone who needs to select or improve Chinese LLMs for specific applications, such as product managers, researchers, or business strategists.

5,675 stars. Actively maintained with 9 commits in the last 30 days.

Use this if you need to understand the strengths and weaknesses of various Chinese large language models across multiple application domains.

Not ideal if you are looking for evaluations of non-Chinese LLMs or if your primary need is for a general-purpose LLM development and observability platform.

AI model selection, Chinese language processing, model benchmarking, AI research, enterprise AI strategy

No License, No Package, No Dependents

Maintenance 17 / 25

Adoption 10 / 25

Maturity 8 / 25

Community 17 / 25

Stars: 5,675

Forks: 229

Language: —

License: —

Recent Releases

v5.8.13, 14 Feb 2026
v5.8.12, 09 Feb 2026
v5.8.11, 02 Feb 2026
v5.8.10, 29 Jan 2026
v5.8.7, 23 Dec 2025

Related tools

bvobart/mllint

`mllint` is a command-line utility to evaluate the technical quality of Python Machine Learning...

ApextheBoss/canary

🐤 Know when your LLM provider silently degrades. Automated quality testing for AI models. Like...

Software-Engineering-Arena/SWE-Chatbot-Arena

Compare chatbots pairwise via multi‑round evaluations for SE tasks.

oolong-tea-2026/arena-ai-leaderboards

📊 Daily auto-updated snapshots of all Arena AI (LMSYS Chatbot Arena) leaderboards — LLM, Vision,...

abject-milkingmachine273/llm-cost-dashboard

Monitor LLM token costs in real time with a terminal dashboard offering per-request tracking,...
