jeinlee1991/chinese-llm-benchmark
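ReLE Evaluation: capability evaluation of Chinese AI large language models (continuously updated). It currently covers 359 large models, including commercial models such as chatgpt, gpt-5.2, o4-mini, Google gemini-3-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3-max, qwen3.5-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.5-flash, kimi-k2.5, ernie4.5, MiniMax-M2.5, deepseek-v3.2, Qwen3.5, llama4, Zhipu GLM-5, GLM-4.7, LongCat, gemma3, and mistral. Besides leaderboards, it also provides an LLM defect library with more than 2 million entries, making it easier for the community to study, analyze, and improve large models.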
This project provides comprehensive evaluations of Chinese large language models (LLMs), covering both commercial and open-source options across diverse fields like education, finance, healthcare, and law. You input a list of Chinese LLMs you are considering, and it outputs detailed performance rankings and a database of known defects for each model. This is for anyone who needs to select or improve Chinese LLMs for specific applications, such as product managers, researchers, or business strategists.
5,675 stars. Actively maintained with 9 commits in the last 30 days.
Use this if you need to understand the strengths and weaknesses of various Chinese large language models across multiple application domains.
Not ideal if you are looking for evaluations of non-Chinese LLMs or if your primary need is for a general-purpose LLM development and observability platform.
Stars: 5,675
Forks: 229
Language: —
License: —
Category:
Last pushed: Mar 07, 2026
Commits (30d): 9
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/jeinlee1991/chinese-llm-benchmark"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
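The same request can be sketched in Python. This is a minimal illustration, not an official client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state. Only the URL pattern is taken from the curl command above.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. No API key is sent, so the
    keyless limit of 100 requests/day applies.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("jeinlee1991", "chinese-llm-benchmark")
# print(data)
```

The response fields are not documented on this page, so the sketch returns the parsed body as-is rather than assuming any particular keys.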
Related tools
bvobart/mllint
`mllint` is a command-line utility to evaluate the technical quality of Python Machine Learning...
ApextheBoss/canary
🐤 Know when your LLM provider silently degrades. Automated quality testing for AI models. Like...
Software-Engineering-Arena/SWE-Chatbot-Arena
Compare chatbots pairwise via multi‑round evaluations for SE tasks.
oolong-tea-2026/arena-ai-leaderboards
📊 Daily auto-updated snapshots of all Arena AI (LMSYS Chatbot Arena) leaderboards — LLM, Vision,...
abject-milkingmachine273/llm-cost-dashboard
Monitor LLM token costs in real time with a terminal dashboard offering per-request tracking,...