open-compass/ANAH

[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO

Quality score: 32 / 100 (Emerging)

This project helps anyone evaluating or building large language models (LLMs) to identify and reduce "hallucinations" — instances where an LLM generates factually incorrect information. It takes LLM-generated text as input and returns sentence-level annotations marking hallucinated content, along with an overall factuality score. The output can be used to improve the reliability of an LLM or to compare the factual accuracy of different models.

No commits in the last 6 months.

Use this if you need to precisely measure the factual accuracy of LLM responses, fine-tune an LLM to reduce its tendency to hallucinate, or benchmark different LLMs for factual correctness.

Not ideal if you are looking for a simple, out-of-the-box solution for general text generation without a focus on factuality or advanced LLM development.

LLM-evaluation AI-safety natural-language-processing fact-checking generative-AI
Stale (6m) · No Package · No Dependents
Maintenance 2 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 6 / 25

How are scores calculated?

Stars

63

Forks

3

Language

Python

License

Apache-2.0

Last pushed

Apr 30, 2025

Commits (30d)

0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/open-compass/ANAH"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
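For scripted access, the same request can be made from Python. A minimal sketch, assuming only the endpoint shape shown in the curl example above (the response format and field names are not documented here, so the fetch helper is illustrative):

```python
# Sketch of calling the quality API from Python. The host and path
# come from the curl example above; the JSON response structure is
# an assumption, as it is not documented on this page.
import json
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the API URL for a repository's quality report."""
    return f"{BASE_URL}/{ecosystem}/{owner}/{repo}"


def fetch_quality(ecosystem: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (requires network)."""
    with urlopen(quality_url(ecosystem, owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL instead of calling the API, to stay within the
    # 100 requests/day anonymous quota.
    print(quality_url("transformers", "open-compass", "ANAH"))
```

With an API key, you would typically pass it as a header or query parameter; check the service's documentation for the exact mechanism.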