Shivanshu-Gupta/Visual-Question-Answering
CNN+LSTM, attention-based, and MUTAN-based models for Visual Question Answering
This system answers natural language questions about images. You provide an image and a question such as "What color is the car?" or "How many people are in the picture?", and it generates a natural language answer from the image content. It is aimed at researchers and data scientists who need to extract information from large image collections using textual queries.
No commits in the last 6 months.
Use this if you need to build or experiment with AI models that can interpret visual information from images and provide textual answers to related questions.
Not ideal if you're looking for a ready-to-use, production-grade application for general image search or descriptive captioning, as this is a research-oriented toolkit.
Stars: 79
Forks: 19
Language: Python
License: —
Category:
Last pushed: Jan 19, 2020
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Shivanshu-Gupta/Visual-Question-Answering"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
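The same request can be made from Python. The sketch below is a minimal illustration using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the `nlp` path segment is copied from the curl command above, and the response format is not stated on this page, so the body is returned as plain text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="nlp"):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside it.
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner, repo, category="nlp", timeout=10.0):
    """Fetch the endpoint and return the raw response body as text.

    The response schema is an unknown here, so no parsing is attempted.
    """
    with urlopen(build_quality_url(owner, repo, category), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

Calling `fetch_quality("Shivanshu-Gupta", "Visual-Question-Answering")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, callers that poll many repositories would likely want to cache results.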
Higher-rated alternatives
asahi417/lm-question-generation
Multilingual/multidomain question generation datasets, models, and python library for question...
SparkJiao/SLQA
An Unofficial Pytorch Implementation of Multi-Granularity Hierarchical Attention Fusion Networks...
MurtyShikhar/Question-Answering
TensorFlow implementation of Match-LSTM and Answer pointer for the popular SQuAD dataset.
hsinyuan-huang/FlowQA
Implementation of conversational QA model: FlowQA (with slight improvement)
allenai/aokvqa
Official repository for the A-OKVQA dataset