shikiw/OPERA
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
This project helps developers of Multi-Modal Large Language Models (MLLMs) reduce "hallucinations" – instances where the model generates inaccurate or fabricated information when describing images. It works by modifying the model's decoding process: given an MLLM, an image, and a text prompt, it produces a more accurate, less hallucination-prone text response without requiring extra training data or external knowledge. It's designed for researchers and developers working on improving the reliability of MLLMs.
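The core idea of the over-trust penalty can be illustrated with a toy sketch. This is not the paper's implementation (OPERA operates on self-attention maps inside beam search); `penalized_score`, its `alpha` parameter, and the use of the peak attention weight as a "concentration" proxy are all illustrative assumptions:

```python
import math

def penalized_score(logprob, attn_weights, alpha=1.0):
    """Toy over-trust penalty (illustrative, not OPERA's actual formula).

    Subtracts a term that grows when attention mass concentrates on a
    single token, a pattern the OPERA paper links to hallucination.
    """
    # Proxy for "over-trust": the peak attention weight in the window.
    concentration = max(attn_weights)
    return logprob - alpha * math.log(1.0 + concentration)

# Two candidates with equal model log-probability:
spread = penalized_score(-1.0, [0.25, 0.25, 0.25, 0.25])  # attention spread out
peaked = penalized_score(-1.0, [0.05, 0.05, 0.05, 0.85])  # attention concentrated
# The concentrated candidate is penalized more heavily, so beam search
# would prefer the spread-out one.
```

In the actual method, candidates whose penalized scores degrade trigger the retrospection-allocation step, which rolls decoding back and reselects tokens.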
399 stars. No commits in the last 6 months.
Use this if you are a developer working with Multi-Modal Large Language Models (MLLMs) and need a method to reduce incorrect or fabricated details in text generated from image inputs, without additional training or external data.
Not ideal if you are an end-user of an MLLM and not involved in its development or fine-tuning, or if you need to mitigate hallucinations using external knowledge bases or specialized training data.
Stars: 399
Forks: 33
Language: Python
License: MIT
Category:
Last pushed: Aug 24, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/shikiw/OPERA"
Open to everyone — 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
Higher-rated alternatives
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
zjunlp/EasyInstruct
[ACL 2024] An Easy-to-use Instruction Processing Framework for LLMs.
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
NVlabs/Eagle
Eagle: Frontier Vision-Language Models with Data-Centric Strategies