batra-mlp-lab/visdial
[CVPR 2017] Torch code for Visual Dialog
This project provides Torch code for training AI agents that hold natural-language conversations about images. Given an image, the dialog history, and a new question, the model generates an answer. It is aimed at researchers and developers building visual assistants or interactive image-understanding systems.
230 stars. No commits in the last 6 months.
Use this if you need to train and evaluate AI models capable of engaging in a question-and-answer dialog about visual content.
Not ideal if you're looking for a ready-to-use, deployable visual dialog system without custom development or advanced model training knowledge.
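The task described above (an image, the dialog history, and a new question go in; an answer comes out) can be sketched as a small data structure. This is only an illustration of the input shape, not the repository's Lua/Torch code: the names `DialogTurn`, `VisualDialogInput`, and `flatten_history`, the file name `kitchen.jpg`, and the "Q: ... A: ..." text layout are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogTurn:
    """One completed question/answer round about the image (hypothetical type)."""
    question: str
    answer: str


@dataclass
class VisualDialogInput:
    """Everything a visual dialog model sees before answering (hypothetical type)."""
    image_path: str
    question: str
    history: List[DialogTurn] = field(default_factory=list)


def flatten_history(example: VisualDialogInput) -> str:
    """Join past turns and the new question into a single text string.

    The "Q: ... A: ..." layout is an assumption made for illustration.
    """
    parts = [f"Q: {turn.question} A: {turn.answer}" for turn in example.history]
    parts.append(f"Q: {example.question} A:")
    return " ".join(parts)


example = VisualDialogInput(
    image_path="kitchen.jpg",  # placeholder path, not a real dataset file
    question="What color is it?",
    history=[DialogTurn("Is there a mug?", "Yes, on the counter.")],
)
print(flatten_history(example))
```

A trained model would take the image together with text like this and produce the missing final answer. The follow-up question ("What color is it?") only makes sense with the earlier turn in view, which is why the history is part of the input.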
Stars: 230
Forks: 69
Language: Lua
License: none listed
Category:
Last pushed: Nov 29, 2018
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/batra-mlp-lab/visdial"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
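For scripted access, the same request can be made from Python's standard library. The sketch below is built on assumptions: it treats the three path segments of the URL above as category, owner, and repository, and it returns the raw response body because the response format is not documented here. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the segment meanings are an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text.

    The body is returned unparsed because its format is not specified here.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Print the URL only; call fetch_quality(...) to make a real request.
print(quality_url("ml-frameworks", "batra-mlp-lab", "visdial"))
```

Each call to `fetch_quality` counts against the keyless limit of 100 requests per day, so caching results locally is a sensible precaution when checking many repositories.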
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)