LLM Hallucination Mitigation Tools
Tools and techniques for detecting, measuring, and correcting hallucinations in large language models across text and multimodal outputs. Does NOT include general LLM evaluation, factuality benchmarks, or non-hallucination-specific safety measures.
We track 34 LLM hallucination mitigation tools. One scores above 50 (the established tier). The highest-rated is vectara/hallucination-leaderboard at 55/100 with 3,122 stars. One of the top 10 is actively maintained.
Get all 34 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=llm-hallucination-mitigation&limit=34"
```
The API is open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
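A minimal sketch of working with the endpoint's JSON in Python. The response shape here (a `projects` list with `name`, `score`, and `tier` fields) is an assumption for illustration, not a documented contract; check the actual payload before relying on it.

```python
import json

# Hypothetical sample payload; field names and the second entry's
# score are assumptions, not real API data.
SAMPLE = json.loads("""
{
  "projects": [
    {"name": "vectara/hallucination-leaderboard", "score": 55, "tier": "Established"},
    {"name": "example/emerging-tool", "score": 40, "tier": "Emerging"}
  ]
}
""")

def tools_in_tier(payload, tier):
    """Return project names in the given tier, highest score first."""
    hits = [p for p in payload["projects"] if p["tier"] == tier]
    return [p["name"] for p in sorted(hits, key=lambda p: p["score"], reverse=True)]

print(tools_in_tier(SAMPLE, "Established"))
# prints ['vectara/hallucination-leaderboard']
```

In practice you would replace `SAMPLE` with the parsed body of the `curl` request above.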
| # | Tool | Description | Tier |
|---|------|-------------|------|
| 1 | vectara/hallucination-leaderboard | Leaderboard Comparing LLM Performance at Producing Hallucinations when... | Established |
| 2 | PKU-YuanGroup/Hallucination-Attack | Attack to induce LLMs within hallucinations | Emerging |
| 3 | amir-hameed-mir/Sirraya_LSD_Code | Layer-wise Semantic Dynamics (LSD) is a model-agnostic framework for... | Emerging |
| 4 | NishilBalar/Awesome-LVLM-Hallucination | up-to-date curated list of state-of-the-art Large vision language models... | Emerging |
| 5 | intuit/sac3 | Official repo for SAC3: Reliable Hallucination Detection in Black-Box... | Emerging |
| 6 | HillZhang1999/llm-hallucination-survey | Reading list of hallucination in LLMs. Check out our new survey paper:... | Emerging |
| 7 | Amirhosein-gh98/Gnosis | Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits | Emerging |
| 8 | OpenMOSS/HalluQA | Dataset and evaluation script for "Evaluating Hallucinations in Chinese... | Emerging |
| 9 | MemTensor/HaluMem | HaluMem is the first operation level hallucination evaluation benchmark... | Emerging |
| 10 | hongcheki/sweet-watermark | Official repository of the paper: Who Wrote this Code? Watermarking for Code... | Emerging |
| 11 | plll4zzx/Awesome-LLM-Watermark | A collection list for Large Language Model (LLM) Watermark | Emerging |
| 12 | VITA-MLLM/Woodpecker | ✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models | Emerging |
| 13 | hzy312/Awesome-LLM-Watermark | UP-TO-DATE LLM Watermark paper. 🔥🔥🔥 | Emerging |
| 14 | zjunlp/FactCHD | [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection | Experimental |
| 15 | oumi-ai/halloumi-demo | Try out HallOumi, a state-of-the-art claim verification model in a simple UI! | Experimental |
| 16 | hongbinye/Cognitive-Mirage-Hallucinations-in-LLMs | Repository for the paper "Cognitive Mirage: A Review of Hallucinations in... | Experimental |
| 17 | Mattbusel/LLM-Hallucination-Detection-Script | A comprehensive toolkit for detecting potential hallucinations in LLM... | Experimental |
| 18 | Intelligent-Computing-Research-Group/HaVen | [DATE 2025] HaVen: hallucination-mitigated LLM for Verilog code generation... | Experimental |
| 19 | 10nc0/Nyan-Protocol | Hallucination guard for AI — one invariant, any model, no training required. | Experimental |
| 20 | lilakk/PostMark | Official repository for "PostMark: A Robust Blackbox Watermark for Large... | Experimental |
| 21 | IAAR-Shanghai/ICSFSurvey | Explore concepts like Self-Correct, Self-Refine, Self-Improve,... | Experimental |
| 22 | hallucinatemd/hallucinate.md | The open standard for telling AI not to hallucinate. | Experimental |
| 23 | kjgpta/WhoDunIt-Evaluation_benchmark_for_culprit_detection_in_mystery_stories | WHODUNIT is a benchmark repository for evaluating large language models'... | Experimental |
| 24 | ruisizhang123/REMARK-LLM | [USENIX Security'24] REMARK-LLM: A robust and efficient watermarking... | Experimental |
| 25 | lasithadilshan/Hallucination-Detector-App | A Hallucination Detection Tool powered by UQML, designed to identify whether... | Experimental |
| 26 | pranav-kural/llm-hallucination-detection-service | Build your own open-source REST API endpoint to detect hallucination in LLM... | Experimental |
| 27 | serhanylmz/pas2 | PAS2: A Python-based hallucination detection system that evaluates AI... | Experimental |
| 28 | DegenAI-Labs/HalluWorld | Repository for the paper "A Unified Definition of Hallucination: It’s The... | Experimental |
| 29 | akborsusom/watermark-ai-analysis | Reproduction and attack analysis of LLM text watermarking (Kirchenbauer et... | Experimental |
| 30 | 141forever/DiaHalu | This is the repository for the paper 'DiaHalu: A Dialogue-level... | Experimental |
| 31 | tranhoangtu-it/halluciguard-api | HalluciGuard API — AI Hallucination Firewall as a Service. Detect and filter... | Experimental |
| 32 | strayfear/HalluWorld | 🌍 Explore the HalluWorld project, a benchmark for understanding and defining... | Experimental |
| 33 | IAAR-Shanghai/UHGEval-dataset | The full pipeline of creating UHGEval hallucination dataset | Experimental |
| 34 | amarquaye/atlas | 🔢Hallucination detector for Large Language Models. | Experimental |