sciknoworg/YESciEval
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering
This tool helps researchers and content creators evaluate the quality of AI-generated answers to scientific questions. You supply a question and the model's answer, and it returns a detailed assessment against predefined scientific rubrics such as correctness, informativeness, and coherence. It is aimed at anyone using AI in scientific research, education, or content generation who needs to ensure accuracy and reliability.
Available on PyPI: https://pypi.org/project/YESciEval/
Use this if you need to objectively assess the quality and scientific rigor of AI-generated answers in fields like biomedicine or multidisciplinary research.
Not ideal if you are evaluating general-knowledge answers or creative writing, as its rubrics are designed specifically for scientific accuracy and understanding.
Stars: 10
Forks: 1
Language: Python
License: MIT
Last pushed: Mar 06, 2026
Commits (30d): 0
Dependencies: 8
Get this data via the API:
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/sciknoworg/YESciEval"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval: One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas: Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit: Open-source evaluation toolkit for large multi-modality models (LMMs); supports 220+ LMMs and 80+ benchmarks
EuroEval/EuroEval: The robust European language model benchmark.
Giskard-AI/giskard-oss: 🐢 Open-Source Evaluation & Testing library for LLM Agents