visual-haystacks/mirage
🔥 [ICLR 2025] Official PyTorch model for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
This tool helps researchers and developers working with Large Multimodal Models (LMMs) answer questions across very large image collections, often tens of thousands of images. It takes a set of images and a question as input, then returns an answer by identifying and reasoning over the relevant visual content. It is aimed at users who need to extract insights from large visual datasets where existing LMMs struggle.
No commits in the last 6 months.
Use this if you need an LMM to answer questions by sifting through and reasoning across tens of thousands of images, where traditional models fail due to scale or context limitations.
Not ideal if your task involves only a small number of images or if you require fine-grained analysis of individual images rather than broad reasoning across large visual datasets.
Stars: 26
Forks: 2
Language: Python
License: MIT
Category: Computer Vision
Last pushed: Feb 09, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/visual-haystacks/mirage"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
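The same endpoint can be called programmatically; a minimal sketch using only the Python standard library, assuming the endpoint returns JSON (the helper names here are hypothetical, not part of the API):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repo, mirroring the curl example above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the record; assumes a JSON response body."""
    with urlopen(quality_url(category, owner, repo), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `quality_url("computer-vision", "visual-haystacks", "mirage")` reproduces the URL used in the curl command above.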
Higher-rated alternatives
EvolvingLMMs-Lab/EgoLife
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
Devanik21/xylia-vision
Vision transformer-powered knowledge extraction. Analyze any image: botanical taxonomy, cultural...
anishalle/YOLO
You Only Look Once, fine-tuned LLM + scene graph reasoning used for navigation by visually...