saharmor/gemini-multimodal-playground

Build realtime voice and video agents with Google's new Gemini 2.0 (API is free for now)

51 / 100 (Established)

This tool allows you to have real-time spoken conversations with an AI assistant that can also understand what it sees from your webcam or screen. You provide voice, video, or screen sharing as input, and the AI responds with spoken audio. This is ideal for anyone who wants to interact with an AI naturally, as if they were talking to another person.

323 stars. No commits in the last 6 months.

Use this if you need an AI that can understand both spoken language and visual information from your camera or screen, and respond to you audibly in real-time.

Not ideal if you're looking for a simple text-based AI chat or a tool that doesn't require live multimedia interaction.

AI assistant, real-time communication, multimedia interaction, voice interface, screen sharing
Stale 6m, No Package, No Dependents
Maintenance 2 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 23 / 25
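
The four category scores above appear to add up to the overall score shown at the top. This is an observation from the numbers on this page, not a documented formula. A minimal sketch of that arithmetic, using a hypothetical `category_scores` dictionary:

```python
# Category scores copied from this page; each is out of 25.
# The dictionary name and the "sum of categories" rule are assumptions
# inferred from the numbers shown, not a documented scoring method.
category_scores = {
    "Maintenance": 2,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 23,
}

max_per_category = 25

total = sum(category_scores.values())
max_total = max_per_category * len(category_scores)

print(f"{total} / {max_total}")
```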


Stars: 323
Forks: 71
Language: TypeScript
License: Apache-2.0
Last pushed: Sep 12, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/saharmor/gemini-multimodal-playground"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
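
The same request can be made from a script. The sketch below uses only the Python standard library and assumes two things this page does not state: that the endpoint returns JSON, and that the path follows a category/owner/repo pattern generalized from the single URL in the curl command. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, inferred from the
    one example URL shown on this page.
    """
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response.

    Assumption: the body is JSON; the page does not document the format.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("llm-tools", "saharmor", "gemini-multimodal-playground")
# print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, it is probably worth caching the result locally rather than calling `fetch_quality` on every run.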