shan18/Perceiver-Resampler-XAttn-Captioning

Generating Captions via Perceiver-Resampler Cross-Attention Networks

Score: 37 / 100 (Emerging)

This project generates natural-language captions for images and videos: given an image or a short video, it outputs a textual description of the content. It is useful for researchers and developers working on content accessibility, automated content moderation, or large-scale media analysis.

No commits in the last 6 months.

Use this if you need to automatically generate descriptive text for visual content, such as images or short videos, using a powerful pre-trained model.

Not ideal if you require real-time captioning on live video feeds or need to fine-tune a model on a very small, highly specialized dataset without transfer learning.

image-captioning video-description content-accessibility media-analysis natural-language-generation
Flags: Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 15 / 25


Stars: 17
Forks: 4
Language: Python
License: MIT
Category: image-captioning
Last pushed: Dec 20, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/shan18/Perceiver-Resampler-XAttn-Captioning"

Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000 requests/day.
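The same endpoint can be queried programmatically. A minimal Python sketch using only the standard library; the helper names (`quality_url`, `fetch_quality`) are illustrative, and the assumption that the response body is JSON is not confirmed by the API description above:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for a repository (category/owner/repo)."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """GET the quality endpoint and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

# Example (requires network access):
# data = fetch_quality("nlp", "shan18", "Perceiver-Resampler-XAttn-Captioning")
# print(data)
```

This mirrors the `curl` command shown above, so scores for any listed repository can be pulled by swapping the category, owner, and repository names.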