rabiulcste/vqazero

visual question answering prompting recipes for large vision-language models

/ 100

Experimental

This project helps researchers and developers explore how to make vision-language models better at answering questions about images without extensive fine-tuning. Given an image and a question, it prompts one of several models to produce a text answer, so that different prompting strategies can be evaluated for answer accuracy. It is designed for AI researchers and practitioners working with vision-language models.

No commits in the last 6 months.

Use this if you are an AI researcher or developer experimenting with advanced vision-language models and want to evaluate different prompting strategies for zero- or few-shot visual question answering tasks.
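To make the idea of comparing zero-shot and few-shot prompting concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the function names (`zero_shot_prompt`, `few_shot_prompt`, `exact_match_accuracy`), the "Question: ... Short answer:" template, and the exact-match metric are hypothetical and are not taken from the vqazero repository. The image itself would be passed to the model separately; only the text side of the prompt is shown.

```python
from typing import List, Sequence, Tuple


def zero_shot_prompt(question: str) -> str:
    """Build a text prompt with no in-context examples (hypothetical template)."""
    return f"Question: {question} Short answer:"


def few_shot_prompt(question: str, examples: Sequence[Tuple[str, str]]) -> str:
    """Prepend solved (question, answer) pairs before the target question."""
    shots: List[str] = [f"Question: {q} Short answer: {a}" for q, a in examples]
    shots.append(zero_shot_prompt(question))
    return "\n".join(shots)


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to the reference after lowercasing and trimming."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


if __name__ == "__main__":
    demo_examples = [("How many dogs are there?", "two"), ("Is it raining?", "no")]
    print(zero_shot_prompt("What color is the car?"))
    print(few_shot_prompt("What color is the car?", demo_examples))
```

In a sketch like this, each prompt string would be sent to a vision-language model together with the image, and the returned answers scored per strategy with `exact_match_accuracy`. Exact match is the simplest possible metric and is used here only to keep the example short.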

Not ideal if you need a simple, out-of-the-box solution for basic image captioning or if you are not comfortable working with command-line interfaces for model inference.

visual question answering, large vision models, prompt engineering, AI research, computer vision

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 7 / 25

Maturity 8 / 25

Community 12 / 25

Language: Python

License: none

Higher-rated alternatives

ShiZhengyan/PowerfulPromptFT

[NeurIPS 2023 Main Track] This is the repository for the paper titled "Don’t Stop Pretraining?...

OpenDriveLab/DriveLM

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

MILVLG/prophet

Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for...

deepankar27/Prompt_Organizer

Managed Prompt Engineering

mala-lab/NegPrompt

The official implementation of CVPR 24' Paper "Learning Transferable Negative Prompts for...
