Cloud-CV/VQA
CloudCV Visual Question Answering Demo
This project answers natural language questions about an image. You provide an image and a question, and it generates a relevant text response. It suits anyone who needs to quickly extract specific information or descriptions directly from images.
No commits in the last 6 months.
Use this if you need to understand the content of an image by posing natural language questions.
Not ideal if you need to generate images, detect objects, or perform complex image editing tasks.
Stars: 67
Forks: 24
Language: Lua
License: not specified
Category:
Last pushed: Nov 04, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Cloud-CV/VQA"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
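The same request can be made from a script. The following is a minimal Python sketch using only the standard library, not an official client. It assumes the URL path follows a category/owner/repo pattern (inferred from the single curl example above) and that the endpoint returns JSON; neither is confirmed here. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    Each segment is percent-encoded so unusual characters cannot alter the path.
    """
    segments = [urllib.parse.quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository and parse the body.

    Assumption: the response body is JSON; its fields are not documented here,
    so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "Cloud-CV", "VQA")` issues the same GET request as the curl command. Unauthenticated calls count toward the 100 requests/day limit, so caching the result locally is worth considering for repeated lookups.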
Higher-rated alternatives
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch
HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
KaiyangZhou/pytorch-vsumm-reinforce
Unsupervised video summarization with deep reinforcement learning (AAAI'18)