CSLiJT/awesome-lm-evaluation-methodologies

Frontier papers on evaluation methodologies for language models.

Score: 21 / 100 (Experimental)

This resource provides a curated collection of research papers focused on evaluating large language models (LLMs). It helps AI researchers, machine learning engineers, and data scientists understand and apply the latest methods for assessing LLM performance, reliability, and safety. You input specific keywords related to evaluation, and it provides links to relevant academic papers.

No commits in the last 6 months.

Use this if you are developing or working with large language models and need to find state-of-the-art methods and benchmarks to evaluate their capabilities.

Not ideal if you are a non-technical user looking for a simple tool to assess an LLM's output without delving into academic research papers.

Tags: AI research, machine learning engineering, natural language processing, LLM evaluation, AI model assessment
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 0 / 25

Stars: 10
Forks:
Language:
License: MIT
Last pushed: Oct 14, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CSLiJT/awesome-lm-evaluation-methodologies"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
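
If you prefer a script to the curl one-liner, the following sketch fetches the same endpoint with Python's standard library and pretty-prints whatever JSON it returns. The response schema is not documented on this page, so the example makes no assumptions about specific fields.

# Minimal sketch: fetch the quality data for this repository from the API
# endpoint shown above and pretty-print the JSON response. No response
# fields are assumed, since the schema is not documented here.
import json
import urllib.request

URL = (
    "https://pt-edge.onrender.com/api/v1/quality/llm-tools/"
    "CSLiJT/awesome-lm-evaluation-methodologies"
)

with urllib.request.urlopen(URL, timeout=30) as response:
    data = json.load(response)

print(json.dumps(data, indent=2))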