MILVLG/prophet

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

43 / 100 (Emerging)

This project helps researchers and AI practitioners improve how large language models (LLMs) answer questions about images when the answers require external knowledge. Given an image and a question, it first uses a conventional VQA model to extract answer heuristics (candidate answers and answer-aware examples), then encodes them into a prompt for an LLM such as GPT-3 to produce more accurate answers. It is aimed at anyone building knowledge-based visual question answering (VQA) systems for research or specialized applications.
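As a rough illustration of the prompting step described above, the following sketch assembles a text prompt from candidate answers and a few worked examples. It is not the repository's code: the `Example` class, the `render` and `build_prompt` helpers, the instruction wording, and the use of an image caption as the textual "context" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Example:
    """One VQA item rendered as text (hypothetical structure)."""
    context: str                         # assumed: a caption standing in for the image
    question: str
    candidates: List[Tuple[str, float]]  # (answer, confidence) pairs from a VQA model
    answer: str = ""                     # left empty for the item to be answered


# Assumed instruction text, written for this sketch only.
INSTRUCTION = (
    "Answer the question using the context and the candidate answers. "
    "Each candidate is followed by its confidence in parentheses."
)


def render(ex: Example) -> str:
    """Format one item; an empty answer leaves the final slot open for the LLM."""
    cands = ", ".join(f"{ans} ({conf:.2f})" for ans, conf in ex.candidates)
    text = (
        f"Context: {ex.context}\n"
        f"Question: {ex.question}\n"
        f"Candidates: {cands}\n"
        f"Answer: {ex.answer}"
    )
    return text.rstrip()


def build_prompt(examples: List[Example], query: Example) -> str:
    """Concatenate the instruction, the in-context examples, and the open query."""
    blocks = [INSTRUCTION] + [render(ex) for ex in examples] + [render(query)]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    demo = Example(
        "a man riding a wave on a surfboard",
        "what sport is this?",
        [("surfing", 0.97), ("skiing", 0.02)],
        "surfing",
    )
    query = Example(
        "a red fire hydrant on a sidewalk",
        "who uses this?",
        [("firefighter", 0.91), ("police", 0.04)],
    )
    print(build_prompt([demo], query))
```

The resulting string would then be sent to whichever LLM is in use. Listing the candidates with their confidences lets the LLM weigh the VQA model's guesses without being forced to choose one of them.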

279 stars. No commits in the last 6 months.

Use this if you need to significantly boost the accuracy of visual question answering systems that rely on external knowledge beyond what's directly visible in an image.

Not ideal if you are looking for a simple, off-the-shelf image captioning tool or a VQA system that does not require integrating knowledge from large language models.

Visual Question Answering, Large Language Models, AI Research, Computer Vision, Knowledge-based AI
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25

Stars: 279
Forks: 28
Language: Python
License: Apache-2.0
Last pushed: Jun 14, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/MILVLG/prophet"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
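The same request can be made from Python using only the standard library. This is a minimal sketch: the `quality_url` and `fetch_quality` helpers are hypothetical names, the category/owner/repo URL pattern is inferred from the single curl example above, and the response is assumed to be JSON, with no particular fields assumed.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response (assumed to be JSON)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network request; keyless access is limited to 100 requests/day.
    data = fetch_quality("prompt-engineering", "MILVLG", "prophet")
    print(json.dumps(data, indent=2))
```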