genglinliu/UnknownBench

Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge

Quality score: 13 / 100 (Experimental)

This project helps evaluate how well large language models (LLMs) recognize and express uncertainty when faced with questions beyond their trained knowledge. It takes a set of questions (some with false premises or about non-existent concepts) and processes them through various LLMs. The output shows how confidently or uncertainly the LLMs respond, helping researchers understand and improve LLM reliability.

No commits in the last 6 months.

Use this if you are an AI researcher or practitioner interested in assessing and improving the trustworthiness of large language models by understanding their uncertainty expression.

Not ideal if you are looking for a tool to fine-tune LLMs for specific tasks or to generate new text content.

Tags: AI research, LLM evaluation, model reliability, AI safety, natural language processing
Badges: No License, Stale (6 months), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 8 / 25
Community: 0 / 25


Stars: 14
Forks:
Language: Jupyter Notebook
License:
Last pushed: Feb 20, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/genglinliu/UnknownBench"

Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
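The same endpoint can be called from code. The sketch below builds the API URL for an arbitrary "owner/name" repo slug and shows how a response might be parsed; the response field names (`score`, etc.) are assumptions for illustration, not a documented schema.

```python
import json
from urllib.parse import quote

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(repo: str) -> str:
    """Build the quality API URL for an 'owner/name' repo slug."""
    return f"{BASE}/{quote(repo, safe='/')}"

url = quality_url("genglinliu/UnknownBench")
print(url)

# Hypothetical response payload -- the real JSON schema may differ.
sample = json.loads('{"score": 13, "maintenance": 0, "adoption": 5}')
print(sample["score"])
```

To actually fetch the data, pass `url` to any HTTP client (e.g. `urllib.request.urlopen(url)`), keeping within the per-day request limits noted above.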