kaylode/vqa-transformer

Visual Question Answering using a Transformer and Bottom-Up attention. Implemented in PyTorch.

Score: 28 / 100 (Experimental)

This project performs Visual Question Answering: given an image and a natural language question about it, the model outputs a concise, one-word answer. It is aimed at AI researchers and machine learning engineers developing or evaluating image understanding systems.

No commits in the last 6 months.

Use this if you are an AI researcher or machine learning engineer exploring how Transformer architectures and bottom-up attention perform on Visual Question Answering tasks.

Not ideal if you need a production-ready, highly accurate VQA system for a broad range of real-world applications or multi-word answers.

visual-question-answering computer-vision natural-language-processing AI-research image-understanding
Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 7 / 25


Stars: 10
Forks: 1
Language: Python
License: MIT
Last pushed: Oct 11, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/kaylode/vqa-transformer"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
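If you prefer to consume the endpoint from Python rather than curl, the same request can be sketched as below. Only the base URL and the `platform/owner/repo` path pattern are taken from the curl example above; the shape of the returned JSON is not documented here, so the code just decodes whatever the API sends back.

```python
import json
from urllib.request import urlopen

# Base endpoint taken from the curl example on this page.
# Open to everyone: 100 requests/day without a key.
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(platform: str, owner: str, repo: str) -> str:
    """Build the quality-score URL for a repository,
    following the path pattern in the curl example."""
    return f"{BASE}/{platform}/{owner}/{repo}"

def fetch_quality(platform: str, owner: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (requires network)."""
    with urlopen(quality_url(platform, owner, repo)) as resp:
        return json.load(resp)

# Example (network required):
# report = fetch_quality("transformers", "kaylode", "vqa-transformer")
```

Swapping in a different repository is just a matter of changing the three path segments; mind the daily rate limit if you script this over many repositories.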