IDEA-CCNL/Real-Gemini

A framework for real-time video understanding and interaction using a large multi-modal model. It lets you ask questions about and converse with the world through text, audio, images, and video.

39 / 100 (Emerging)

This project helps you build interactive applications that can understand and respond to real-time video content. It takes live video streams, along with your text or audio input, and provides intelligent answers or generates new content like music. This is designed for application developers who want to integrate advanced multi-modal AI into their projects.

No commits in the last 6 months.

Use this if you are a developer building real-time applications that require understanding live video and interacting with it using various modalities like text, audio, and images.

Not ideal if you are an end-user looking for a ready-to-use application, as this project provides a framework and requires technical setup.

real-time-video-processing interactive-AI-development multi-modal-applications live-system-integration
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 7 / 25
Maturity 16 / 25
Community 16 / 25

Stars: 26
Forks: 6
Language: Python
License: Apache-2.0
Last pushed: Jan 26, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/IDEA-CCNL/Real-Gemini"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
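
The curl request above could also be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns a JSON body and that the URL path follows a category/owner/repo pattern, which is inferred from the single example URL rather than documented here. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL.

    Assumption: the path is <category>/<owner>/<repo>, as suggested by
    the one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. Its fields are not described
    on this page, so the parsed object is returned unchanged.
    """
    url = build_quality_url(category, owner, repo)
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("llm-tools", "IDEA-CCNL", "Real-Gemini")` would issue the same keyless request as the curl command, so it counts against the 100 requests/day limit.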