ltguo19/VSUA-Captioning
Code for "Aligning Linguistic Words and Visual Semantic Units for Image Captioning", ACM MM 2019
This project helps computer vision researchers automatically generate descriptive captions for images. Given raw images, pre-extracted visual features, and existing image-caption pairs for training, it produces human-readable sentences that accurately describe image content. It is aimed at researchers working at the intersection of machine vision and natural language processing.
258 stars. No commits in the last 6 months.
Use this if you are a researcher focused on advancing image captioning models and want to build upon a system that aligns linguistic words with visual semantic units.
Not ideal if you need an out-of-the-box solution for generating image captions in a production environment, or if you lack GPU hardware or Python development experience.
Stars: 258
Forks: 24
Language: Python
License: MIT
Category: NLP
Last pushed: Oct 18, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ltguo19/VSUA-Captioning"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
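If you prefer Python to curl, a minimal sketch using the requests library is below. It assumes the endpoint returns JSON; the response schema is not documented on this page, so the sketch simply prints the raw payload.

import requests

# Endpoint copied from the curl example above; no API key is needed
# for up to 100 requests/day.
url = "https://pt-edge.onrender.com/api/v1/quality/nlp/ltguo19/VSUA-Captioning"

resp = requests.get(url, timeout=10)
resp.raise_for_status()

# The response is assumed to be JSON; its exact fields are not documented
# here, so we print the whole payload rather than specific keys.
print(resp.json())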
Higher-rated alternatives
ntrang086/image_captioning
generate captions for images using a CNN-RNN model that is trained on the Microsoft Common...
fregu856/CS224n_project
Neural Image Captioning in TensorFlow.
vacancy/SceneGraphParser
A python toolkit for parsing captions (in natural language) into scene graphs (as symbolic...
Abdelrhman-Yasser/video-content-description
Video content description model for generating descriptions for unconstrained videos
kozodoi/BMS_Molecular_Translation
Image-to-text translation of chemical molecule structures with deep learning (top-5% Kaggle solution)