saharmor/gemini-multimodal-playground

Build realtime voice and video agents with Google's new Gemini 2.0 (API is free for now)

/ 100

Established

This tool allows you to have real-time spoken conversations with an AI assistant that can also understand what it sees from your webcam or screen. You provide voice, video, or screen sharing as input, and the AI responds with spoken audio. This is ideal for anyone who wants to interact with an AI naturally, as if they were talking to another person.

323 stars. No commits in the last 6 months.

Use this if you need an AI that can understand both spoken language and visual information from your camera or screen, and respond to you audibly in real time.

Not ideal if you're looking for a simple text-based AI chat or a tool that doesn't require live multimedia interaction.

AI assistant, real-time communication, multimedia interaction, voice interface, screen sharing

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 23 / 25


Stars 323

Forks

Language TypeScript

License Apache-2.0

Related tools

HanaokaYuzu/Gemini-API

✨ Reverse-engineered Python API for Google Gemini web app

faetalize/zodiac

A ⚡lightweight⚡ frontend for Google's Gemini Pro.

hihumanzone/Gemini-Discord-Bot

A Discord bot leveraging Google Gemini. Has image/video/audio recognition, conversation...

Amm1rr/WebAI-to-API

Gemini to API (Don't need API KEY) (ChatGPT, Claude, DeepSeek, Grok and more)

GewoonJaap/gemini-cli-openai

Expose Gemini CLI endpoints as OpenAI API with Cloudflare Workers
