SALMONN and video-SALMONN-2
SALMONN and video-SALMONN-2 are ecosystem siblings: SALMONN is a foundational multi-modal LLM framework, and video-SALMONN-2 is a specialized extension that applies the same architecture to audio-visual video understanding tasks.
About SALMONN
bytedance/SALMONN
SALMONN family: A suite of advanced multi-modal LLMs
This project offers tools to build or use multi-modal AI models that understand audio, video, and text inputs and generate text in response. It supports tasks such as creating detailed video captions, answering questions about video content, and evaluating the quality of spoken audio. It is useful for anyone who needs to process and interpret complex multimedia data for content analysis, media management, or accessibility.
About video-SALMONN-2
bytedance/video-SALMONN-2
video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions. It was developed by the Department of Electronic Engineering at Tsinghua University and ByteDance.
This project helps content creators, marketers, and educators by automatically generating high-quality captions for videos, taking into account both what is seen and heard. You provide video files, and it outputs detailed, accurate captions that enhance accessibility and understanding. It's designed for anyone needing to quickly and efficiently caption video content.