hallucination-leaderboard and Awesome-LVLM-Hallucination

These are ecosystem siblings: one benchmarks hallucination behavior in text-based LLMs while the other curates research resources for vision-language model hallucinations, addressing related problems across different modalities within the broader hallucination mitigation domain.

hallucination-leaderboard
Maintenance 13/25 · Adoption 10/25 · Maturity 16/25 · Community 16/25
Stars: 3,122 · Forks: 96 · Commits (30d): 3
Language: Python · License: Apache-2.0
No package published · No dependents

Awesome-LVLM-Hallucination
Maintenance 10/25 · Adoption 10/25 · Maturity 8/25 · Community 11/25
Stars: 283 · Forks: 15 · Commits (30d): 0
Language: not detected · License: none
No package published · No dependents

About hallucination-leaderboard

vectara/hallucination-leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

When you're trying to pick the best Large Language Model (LLM) for summarizing documents, this leaderboard helps you evaluate their reliability. It shows how often different LLMs invent information (hallucinate) when summarizing texts, using Vectara's specialized evaluation model. This is useful for anyone who relies on LLMs for accurate content generation, such as content creators, researchers, or data analysts.
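The leaderboard itself relies on Vectara's trained evaluation model to judge factual consistency, but the core idea can be illustrated with a deliberately simple toy heuristic (this sketch is an illustration only, not Vectara's method): flag content words in a summary that never appear in the source document, which would catch a summary that invents a figure or entity.

```python
def unsupported_tokens(source: str, summary: str) -> list[str]:
    """Toy hallucination check: return content words in the summary
    that never appear in the source text. A real evaluator (such as
    Vectara's model used by this leaderboard) uses a trained
    entailment-style classifier instead of word overlap."""
    stopwords = {"the", "a", "an", "is", "are", "was", "were",
                 "of", "in", "and", "to", "on", "for"}
    source_words = {w.strip(".,;:").lower() for w in source.split()}
    return [w for w in summary.split()
            if w.strip(".,;:").lower() not in source_words
            and w.strip(".,;:").lower() not in stopwords]

source = "The company reported revenue of 5 million dollars in 2023."
summary = "The company reported revenue of 7 million dollars."
print(unsupported_tokens(source, summary))  # → ['7']
```

A word-overlap check like this produces many false positives on legitimate paraphrases, which is exactly why leaderboards of this kind use a learned consistency model rather than surface matching.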

LLM evaluation · content summarization · fact-checking · AI reliability · natural language processing

About Awesome-LVLM-Hallucination

NishilBalar/Awesome-LVLM-Hallucination

An up-to-date curated list of state-of-the-art research work, papers, and resources on Large Vision-Language Model hallucinations

When working with Large Vision Language Models (LVLMs), also known as Multimodal Large Language Models (MLLMs), you might encounter 'hallucinations' where the model generates text describing things not present in the visual input. This resource provides an organized collection of state-of-the-art research papers, code, and descriptions related to detecting and mitigating these LVLM hallucinations. It's for researchers, developers, or practitioners who are building, evaluating, or deploying LVLMs and need to address their reliability.

Large Vision Language Models · Multimodal AI · AI Reliability · AI Evaluation · Generative AI

Scores updated daily from GitHub, PyPI, and npm data.