shan18/Perceiver-Resampler-XAttn-Captioning

Generating Captions via Perceiver-Resampler Cross-Attention Networks

Score: 37 / 100 (Emerging)

This project generates natural-language captions for images and videos: given an image or a short video, it outputs a textual description of the content. It is useful for researchers and developers working on content accessibility, automated content moderation, or large-scale media analysis.

No commits in the last 6 months.

Use this if you need to automatically generate descriptive text for visual content, such as images or short videos, using a powerful pre-trained model.

Not ideal if you require real-time captioning on live video feeds or need to fine-tune a model on a very small, highly specialized dataset without transfer learning.

image-captioning video-description content-accessibility media-analysis natural-language-generation
Flags: Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 15 / 25


Stars: 17
Forks: 4
Language: Python
License: MIT
Category: image-captioning
Last pushed: Dec 20, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/shan18/Perceiver-Resampler-XAttn-Captioning"

Open to everyone: 100 requests/day with no key required; a free key raises the limit to 1,000 requests/day.
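The same endpoint can be queried programmatically. A minimal Python sketch using only the standard library; the helper names (`quality_url`, `fetch_quality`) are illustrative, and the assumption that the response body is JSON is not confirmed by the API description above:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for a repository (category/owner/repo)."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """GET the quality endpoint and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(quality_url(category, owner, repo)) as resp:
        return json.load(resp)

# Example (requires network access):
# data = fetch_quality("nlp", "shan18", "Perceiver-Resampler-XAttn-Captioning")
# print(data)
```

This mirrors the `curl` command shown above, so scores for any listed repository can be pulled by swapping the category, owner, and repository names.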