rabiulcste/vqazero

visual question answering prompting recipes for large vision-language models

27 / 100 (Experimental)

This project helps researchers and developers explore prompting strategies that make vision-language models better at answering questions about images without extensive fine-tuning. It feeds an image and a question into one of several models and returns a text answer, so that different prompts can be compared on visual question answering accuracy. It is aimed at AI researchers and practitioners working with large vision-language models.

No commits in the last 6 months.

Use this if you are an AI researcher or developer experimenting with advanced vision-language models and want to evaluate different prompting strategies for zero- or few-shot visual question answering tasks.

Not ideal if you need a simple, out-of-the-box solution for basic image captioning or if you are not comfortable working with command-line interfaces for model inference.

visual question answering, large vision models, prompt engineering, AI research, computer vision
No License, Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 8 / 25
Community 12 / 25

Stars: 28
Forks: 4
Language: Python
License: none listed
Last pushed: Sep 14, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/rabiulcste/vqazero"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
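
The same request can be made from a script. The sketch below is a minimal, hypothetical Python equivalent of the curl command above, using only the standard library. It assumes the endpoint returns a JSON body and that the URL follows a `<category>/<owner>/<repo>` pattern, as the example URL suggests; neither is confirmed by this page. The helper names `build_quality_url`, `fetch_quality`, and `main` are illustrative and not part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repository.

    Assumption: URLs follow <base>/<category>/<owner>/<repo>,
    inferred from the single example URL on this page.
    """
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(url: str, timeout: float = 10.0):
    """GET the URL and decode the body as JSON.

    Assumption: the response is JSON; its fields are not documented here,
    so the decoded object is returned as-is.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def main() -> None:
    # Performs a real network request, which counts against the daily quota.
    url = build_quality_url("prompt-engineering", "rabiulcste/vqazero")
    print(json.dumps(fetch_quality(url), indent=2))
```

Calling `main()` prints whatever the service returns. Since keyless access is limited to 100 requests/day, caching the response locally is advisable when running this repeatedly.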