HICAI-ZJU/SciKnowEval

SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

Score: 27 / 100 (Experimental)

This project helps evaluate how well large language models (LLMs) understand and apply scientific knowledge across various domains like Biology, Chemistry, Physics, and Materials Science. It takes an LLM's responses to scientific questions as input and provides a detailed assessment of its abilities, from recalling facts to complex reasoning and ethical discernment. Scientists, researchers, and AI developers can use this to benchmark and improve LLMs for scientific applications.

No commits in the last 6 months.

Use this if you need to thoroughly assess a large language model's capabilities in scientific contexts, especially its ability to remember, comprehend, reason, discern, and apply scientific knowledge.

Not ideal if you need a general-purpose LLM evaluation without a specific focus on multi-level scientific knowledge, or if your model is not designed for scientific tasks.

scientific-research LLM-evaluation AI-in-science knowledge-assessment scientific-AI
No License · Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 10 / 25


Stars: 27
Forks: 3
Language: Python
License: None
Last pushed: Jul 13, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/HICAI-ZJU/SciKnowEval"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
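For scripted access, the same endpoint can be called from Python. The sketch below is a minimal example assuming only the URL shape shown in the curl command above (`/quality/<ecosystem>/<owner>/<repo>`); the structure of the JSON response is not documented here, so it is treated as an opaque dictionary.

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    # Path shape mirrors the curl example: /quality/<ecosystem>/<owner>/<repo>
    return f"{API_BASE}/{ecosystem}/{owner}/{repo}"

def fetch_quality(ecosystem: str, owner: str, repo: str, timeout: float = 10.0) -> dict:
    # No key is needed for up to 100 requests/day; the response schema is
    # not documented here, so the decoded JSON is returned as-is.
    with urllib.request.urlopen(quality_url(ecosystem, owner, repo), timeout=timeout) as resp:
        return json.load(resp)

# Prints the request URL for this repository (no network call).
print(quality_url("nlp", "HICAI-ZJU", "SciKnowEval"))
```

To actually retrieve the data, call `fetch_quality("nlp", "HICAI-ZJU", "SciKnowEval")`; with a free key the daily limit rises to 1,000 requests, though the header or parameter used to pass the key is not specified here.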