ChanLiang/CONNER
[EMNLP 2023] Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators
This project provides a systematic way to evaluate how well Large Language Models (LLMs) generate knowledge. You input text produced by an LLM, and it returns scores across dimensions such as factuality, relevance, and helpfulness. It is designed for researchers and practitioners who develop or deploy LLMs and need to rigorously assess output quality.
No commits in the last 6 months.
Use this if you are developing or using Large Language Models and need a comprehensive, multi-faceted evaluation of the knowledge they generate beyond just factual accuracy.
Not ideal if you are looking for a simple, single-metric 'pass/fail' evaluation or if you are not working directly with LLM outputs.
Stars: 33
Forks: 2
Language: Python
License: —
Category: —
Last pushed: Jan 22, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ChanLiang/CONNER"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
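The same endpoint can be called from Python instead of curl. This is a minimal sketch: only the endpoint URL comes from the page above; the helper names and the assumption that the API returns JSON are illustrative.

```python
import json
import urllib.request

# Base endpoint taken from the curl example on this page.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the API URL for a given GitHub repository."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality report (assumes a JSON response body).

    Unauthenticated access is limited to 100 requests/day.
    """
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

# Example: URL for the repository described on this page.
print(quality_url("ChanLiang", "CONNER"))
```

The response schema is not documented here, so inspect the returned dictionary before relying on specific fields.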
Higher-rated alternatives
PaddlePaddle/PaddleNLP
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
meta-llama/llama-cookbook
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: Getting started...
arcee-ai/mergekit
Tools for merging pretrained large language models.
changyeyu/LLM-RL-Visualized
100+ LLM/RL algorithm maps (visual explanations of large-model algorithms).
mindspore-lab/step_into_llm
MindSpore online courses: Step into LLM