JindongGu/Awesome-Prompting-on-Vision-Language-Model

This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.

Quality score: 33 / 100 (Emerging)

This resource helps AI researchers and practitioners understand and apply prompt engineering to Vision-Language Models (VLMs). It curates research papers on prompting three classes of models: multimodal-to-text generation, image-text matching, and text-to-image generation. Researchers can use it to quickly find relevant studies and techniques for adapting these models to new tasks.

509 stars. No commits in the last 6 months.

Use this if you are a researcher or AI practitioner exploring the cutting-edge of prompt engineering for Vision-Language Models and need a curated list of influential papers and their classifications.

Not ideal if you are looking for an introductory guide to large language models or a hands-on coding tutorial for VLM prompting, as this is a research-focused survey.

Tags: AI Research, Machine Learning Engineering, Computer Vision, Natural Language Processing, Generative AI
Flags: No License, Stale (6 months), No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 8 / 25
Community 15 / 25


Stars: 509
Forks: 40
Language: not specified
License: none
Last pushed: Mar 18, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/JindongGu/Awesome-Prompting-on-Vision-Language-Model"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
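If you prefer to consume the endpoint programmatically rather than via curl, a minimal Python sketch is below. Only the URL comes from the page above; the structure of the JSON response is an assumption, so inspect the payload before relying on specific fields.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
URL = ("https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/"
       "JindongGu/Awesome-Prompting-on-Vision-Language-Model")


def fetch_quality_report(url: str) -> dict:
    """Fetch the quality report and return the parsed JSON payload.

    Note: the field names in the response (stars, scores, etc.) are not
    documented here; print the full payload first to see what is returned.
    """
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    report = fetch_quality_report(URL)
    # Pretty-print the whole payload rather than assuming field names.
    print(json.dumps(report, indent=2))
```

Keep the free tier's limit of 100 requests/day in mind if you poll this endpoint on a schedule.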