rungalileo/hallucination-index

Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.

Quality score: 31 / 100 (Emerging)

This index helps you identify which large language models (LLMs) are most reliable and least likely to invent information, a failure mode known as 'hallucination.' It ranks popular LLMs across different task types and context lengths so you can see which models perform best. If you are building LLM-powered applications that depend on factual accuracy, this index helps you choose the right foundation model for your needs.

116 stars. No commits in the last 6 months.

Use this if you are selecting a large language model for a Retrieval-Augmented Generation (RAG) system and need to minimize the risk of the model providing incorrect or fabricated information.

Not ideal if your primary concern is model speed, cost, or general creative generation rather than factual accuracy from provided context.

Tags: LLM evaluation · AI model selection · RAG system design · AI risk management · language model accuracy
No License · Stale (6 months) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 10 / 25
Maturity: 8 / 25
Community: 11 / 25
These four 25-point subscores sum to the overall score: 2 + 10 + 8 + 11 = 31 / 100.


Stars: 116
Forks: 9
Language: not listed
License: none
Last pushed: Jul 28, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/rungalileo/hallucination-index"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
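
For programmatic use, here is a minimal Python sketch of the same request. It assumes the endpoint returns JSON; because the response schema is not documented on this page, the payload is printed as-is rather than reading specific fields.

import json
import urllib.request

# Public quality endpoint for this repository (no key required, 100 requests/day).
URL = "https://pt-edge.onrender.com/api/v1/quality/rag/rungalileo/hallucination-index"

# Fetch and decode the JSON payload.
with urllib.request.urlopen(URL, timeout=10) as resp:
    data = json.load(resp)

# Print the full payload; no field names are assumed here, since the
# response schema is undocumented on this page.
print(json.dumps(data, indent=2))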