vtu81/NaiveVQA

A Visual Question Answering (VQA) model implemented in MindSpore and PyTorch. The model is a reimplementation of the paper *Show, Ask, Attend, and Answer: A Strong Baseline for Visual Question Answering*. It is the authors' final project for the DL4NLP course at ZJU.

Score: 28 / 100 (Experimental)

This project helps you build a system that can answer questions about images, similar to how a human would. You provide it with a collection of images and a list of questions related to those images, and it outputs the predicted answers. This is useful for researchers and machine learning engineers working on artificial intelligence that can interpret visual information and language.

No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer looking for a baseline model to understand or experiment with Visual Question Answering (VQA).

Not ideal if you need a production-ready VQA system for immediate deployment, or if you lack access to substantial computational resources such as an NVIDIA GPU.

visual-question-answering computer-vision natural-language-processing deep-learning-research AI-model-development
No License · Stale (6m) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 8 / 25
Community: 15 / 25


Stars: 10
Forks: 4
Language: Jupyter Notebook
License: none
Last pushed: Jul 27, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/vtu81/NaiveVQA"
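The same endpoint can be called from Python. A minimal sketch using only the standard library, assuming the URL from the curl example above; the response schema is not documented on this page, so the raw JSON is printed as-is:

```python
import json
import urllib.request

# Endpoint copied from the curl example above. The shape of the returned
# JSON is not documented here, so we decode and print it without assuming
# any particular fields.
API_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/vtu81/NaiveVQA"

def fetch_quality(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Fetch the quality report for a repository and decode it as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(json.dumps(fetch_quality(), indent=2))
```

The keyless tier is rate-limited (see below), so cache the response rather than calling the endpoint on every run.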

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.