genglinliu/UnknownBench
Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
This project evaluates how well large language models (LLMs) recognize and express uncertainty when asked questions outside their parametric knowledge. It feeds a set of questions (some with false premises or about non-existent concepts) to various LLMs and records how confidently or uncertainly they respond, helping researchers understand and improve LLM reliability.
No commits in the last 6 months.
Use this if you are an AI researcher or practitioner interested in assessing and improving the trustworthiness of large language models by understanding their uncertainty expression.
Not ideal if you are looking for a tool to fine-tune LLMs for specific tasks or to generate new text content.
Stars: 14
Forks: —
Language: Jupyter Notebook
License: —
Category:
Last pushed: Feb 20, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/genglinliu/UnknownBench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
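The endpoint appears to follow the pattern `/api/v1/quality/<category>/<owner>/<repo>`, inferred from the single curl example above; that URL pattern, and the assumption that the response is JSON, are not confirmed by the documentation. A minimal Python sketch under those assumptions:

```python
# Sketch of calling the pt-edge quality API from Python.
# The URL pattern and JSON response format are assumptions inferred
# from the single curl example above, not a documented contract.
import json
import urllib.request

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repo (pattern assumed from the example)."""
    return f"{BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode one repo's quality record (assumes a JSON body)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.loads(resp.read().decode())


# Same repo as the curl command above (network call, so left commented out):
# print(fetch_quality("transformers", "genglinliu", "UnknownBench"))
```

Within the free tier this needs no authentication; how an API key is attached for the 1,000/day tier is not shown above, so it is omitted here.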
Higher-rated alternatives
cvs-health/uqlm
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM...
PRIME-RL/TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
sapientinc/HRM
Hierarchical Reasoning Model Official Release
tigerchen52/query_level_uncertainty
query-level uncertainty in LLMs
reasoning-survey/Awesome-Reasoning-Foundation-Models
✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models