visual-haystacks/mirage
🔥 [ICLR 2025] Official PyTorch model for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"
This tool helps researchers and developers working with Large Multimodal Models (LMMs) answer questions across very large image collections, often tens of thousands of images. It takes a set of images and a question as input, then returns an answer by identifying and reasoning over the relevant visual content. It is aimed at users who need to extract insights from large visual datasets where existing LMMs struggle.
No commits in the last 6 months.
Use this if you need an LMM to answer questions by sifting through and reasoning across tens of thousands of images, where traditional models fail due to scale or context limitations.
Not ideal if your task involves only a small number of images or if you require fine-grained analysis of individual images rather than broad reasoning across large visual datasets.
Stars: 26
Forks: 2
Language: Python
License: MIT
Category: Computer Vision
Last pushed: Feb 09, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/visual-haystacks/mirage"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
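The same endpoint can be called programmatically; a minimal sketch using only the Python standard library, assuming the endpoint returns JSON (the helper names here are hypothetical, not part of the API):

```python
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for a repo, mirroring the curl example above."""
    return f"{BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch and decode the record; assumes a JSON response body."""
    with urlopen(quality_url(category, owner, repo), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, `quality_url("computer-vision", "visual-haystacks", "mirage")` reproduces the URL used in the curl command above.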
Higher-rated alternatives
EvolvingLMMs-Lab/EgoLife
[CVPR 2025] EgoLife: Towards Egocentric Life Assistant
Devanik21/xylia-vision
Vision transformer-powered knowledge extraction. Analyze any image: botanical taxonomy, cultural...
anishalle/YOLO
You Only Look Once, fine-tuned LLM + scene graph reasoning used for navigation by visually...