genglinliu/UnknownBench

Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge

Quality score: 13 / 100 (Experimental)

This project helps evaluate how well large language models (LLMs) recognize and express uncertainty when faced with questions beyond their trained knowledge. It takes a set of questions (some with false premises or about non-existent concepts) and processes them through various LLMs. The output shows how confidently or uncertainly the LLMs respond, helping researchers understand and improve LLM reliability.

No commits in the last 6 months.

Use this if you are an AI researcher or practitioner interested in assessing and improving the trustworthiness of large language models by understanding their uncertainty expression.

Not ideal if you are looking for a tool to fine-tune LLMs for specific tasks or to generate new text content.

Tags: AI research, LLM evaluation, model reliability, AI safety, natural language processing
Badges: No License, Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 8 / 25
Community: 0 / 25


Stars: 14
Forks:
Language: Jupyter Notebook
License:
Last pushed: Feb 20, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/genglinliu/UnknownBench"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
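The same endpoint can be called from code. The sketch below builds the API URL for an arbitrary "owner/name" repo slug and shows how a response might be parsed; the response field names (`score`, etc.) are assumptions for illustration, not a documented schema.

```python
import json
from urllib.parse import quote

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo: str) -> str:
    """Build the quality API URL for an 'owner/name' repo slug."""
    return f"{BASE}/{quote(repo, safe='/')}"

url = quality_url("genglinliu/UnknownBench")
print(url)

# Hypothetical response payload -- the real JSON schema may differ.
sample = json.loads('{"score": 13, "maintenance": 0, "adoption": 5}')
print(sample["score"])
```

To actually fetch the data, pass `url` to any HTTP client (e.g. `urllib.request.urlopen(url)`), keeping within the per-day request limits noted above.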