GetStream/Vision-Agents
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
This project helps you build AI assistants that watch and listen to live video and audio and respond in real time. It feeds live video and audio into AI models to produce insights or interactions, such as real-time coaching or anomaly detection. It is aimed at developers building interactive AI applications that must understand and respond to visual and auditory cues immediately.
7,366 stars. Actively maintained with 46 commits in the last 30 days. Available on PyPI.
Use this if you need to create multi-modal AI agents that can watch, listen, and understand live video streams with ultra-low latency for applications like sports coaching, drone monitoring, or virtual assistants.
Not ideal if your application does not require real-time video and audio processing or if you're not comfortable integrating various AI models and services.
Stars: 7,366
Forks: 574
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 13, 2026
Commits (30d): 46
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/GetStream/Vision-Agents"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
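The same request can be made from Python. The sketch below is minimal and uses only the standard library. The URL pattern is taken from the curl command above; the helper names build_quality_url and fetch_quality are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page. How a key is passed for the higher limit is not documented here, so the sketch covers only the keyless tier.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body as JSON.

    Assumption: the response is JSON; its fields are not documented here,
    so the decoded object is returned without further interpretation.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_quality("GetStream", "Vision-Agents") issues the same GET as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a reasonable precaution when polling many repositories.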
Related agents
video-db/videodb-capture-quickstart
Give your agents real time desktop perception. Stream screen, microphone, and system audio for...
sijeeshmiziha/visionagent
Multi-provider AI agent framework with vision capabilities and tool calling. Supports OpenAI,...
grctest/g3n-fastapi-webcam-docker
Utilizing multiple Gemma 3n agents to analyze webcam footage
leukaemiamedtech/hias-tassai-facial-recognition
HIAS TassAI Facial Recognition Agent processes streams from local or remote cameras to identify...
TheSethRose/AI-File-Organizer-Agent
Uses an AI agent (powered by Google Gemini via the Agno framework) to intelligently propose and...