Amirhosein-gh98/Gnosis

Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Quality score: 35 / 100 (Emerging)

This tool helps AI practitioners and researchers evaluate the reliability of responses from large language models (LLMs). By attaching a lightweight 'self-awareness head' to an existing LLM, it predicts a numerical probability of correctness for each generated answer. This allows users to understand how confident the LLM is in its own output, providing a crucial metric for tasks like question answering, content generation, and summarization.
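
The 'self-awareness head' described above is, in essence, a small trainable probe attached to the frozen base model's hidden states. Below is a minimal sketch of that idea; the class name SelfAwarenessHead, the probe architecture, and the pooling choice are illustrative assumptions, not Gnosis's actual implementation.

import torch
import torch.nn as nn

class SelfAwarenessHead(nn.Module):
    """Hypothetical probe: maps an LLM's hidden states to P(answer is correct)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # The base LLM stays frozen; only this lightweight probe is trained.
        self.probe = nn.Sequential(
            nn.Linear(hidden_size, hidden_size // 4),
            nn.GELU(),
            nn.Linear(hidden_size // 4, 1),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) from the LLM's last layer.
        pooled = hidden_states.mean(dim=1)  # simple mean pooling over tokens
        return torch.sigmoid(self.probe(pooled)).squeeze(-1)  # (batch,) probabilities

# Example with fake hidden states: four answers, 128 tokens each, width 768.
scores = SelfAwarenessHead(768)(torch.randn(4, 128, 768))
print(scores)  # one correctness probability per answer, each in (0, 1)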

Use this if you need to assess the trustworthiness of an LLM's output and want to automatically flag potentially incorrect answers.

Not ideal if you are looking for a tool that improves the LLM's answer generation itself; this project evaluates outputs the model has already produced.

Tags: LLM evaluation, AI reliability, Natural Language Processing, Model confidence, AI research

No license · No package · No dependents

Maintenance: 6 / 25
Adoption: 7 / 25
Maturity: 5 / 25
Community: 17 / 25

Stars: 32
Forks: 9
Language: Python
License: none
Last pushed: Jan 08, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Amirhosein-gh98/Gnosis"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
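
The same endpoint can be called from Python. Here is a minimal sketch using requests; the JSON field names returned by the API are not documented on this page, so the example only prints the payload for inspection.

import requests

url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Amirhosein-gh98/Gnosis"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(resp.json())  # field names are undocumented here; inspect before relying on them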