MLGroupJLU/LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

Score: 37/100 (Emerging)

This resource provides a curated collection of research papers and materials focused on evaluating Large Language Models (LLMs). It helps researchers and practitioners understand LLM performance across many dimensions, from natural language processing tasks such as sentiment analysis and reasoning to robustness and ethical considerations. The collection is organized so users can quickly find relevant studies on how LLMs are assessed.

1,591 stars. No commits in the last 6 months.

Use this if you are an AI researcher, LLM developer, or academic looking for a comprehensive overview of current research and benchmarks on evaluating Large Language Models.

Not ideal if you are looking for a practical guide on how to evaluate a specific LLM, or if you need code implementations for evaluation metrics.

Tags: AI-research, LLM-evaluation, natural-language-processing, machine-learning-research, academic-research
Badges: No License, Stale (6 months), No Package, No Dependents
Maintenance: 2/25
Adoption: 10/25
Maturity: 8/25
Community: 17/25


Stars: 1,591
Forks: 100
Language: not listed
License: none
Last pushed: Jun 03, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/MLGroupJLU/LLM-eval-survey"

Open to everyone: 100 requests/day with no API key. A free key raises the limit to 1,000 requests/day.
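For scripted access, here is a minimal Python sketch using the requests library, equivalent to the curl call above. The response schema is not documented on this page, so the example simply fetches the endpoint and pretty-prints whatever JSON it returns; no API key is passed, matching the keyless 100 requests/day tier.

import json
import requests

# Same endpoint as the curl example above.
API_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/MLGroupJLU/LLM-eval-survey"

def fetch_quality_data(url=API_URL):
    """Fetch the quality data for this repository; raises on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # The response fields are undocumented here, so print everything.
    print(json.dumps(fetch_quality_data(), indent=2))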