Vision Agent Platforms AI Agents

Tools and frameworks for building production AI agents that process visual data from cameras, videos, or images in real time. Includes multi-modal vision APIs, streaming video analysis, and visual perception systems. Does NOT include general computer vision libraries, image processing utilities, or non-agentic vision applications.

There are 36 vision agent platforms tracked. 1 scores above 70 (verified tier). The highest-rated is GetStream/Vision-Agents at 73/100 with 7,366 stars. 1 of the top 10 is actively maintained.

Get all 36 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=agents&subcategory=vision-agent-platforms&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
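The same request can be sketched in Python using only the standard library. The query parameters (`domain`, `subcategory`, `limit`) are copied from the curl command above. The helper names `build_url`, `fetch_dataset`, and `main` are hypothetical, and the shape of the returned JSON is not documented here, so the sketch only decodes and prints it. Whether a `limit` above 20 is accepted, or returns all 36 projects, is an assumption that has not been confirmed.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="agents", subcategory="vision-agent-platforms", limit=20):
    """Build the dataset URL; defaults mirror the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_dataset(url, timeout=30):
    """Send a keyless GET request and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def main():
    # The response schema is not specified on this page, so just pretty-print
    # the beginning of whatever comes back.
    data = fetch_dataset(build_url())
    print(json.dumps(data, indent=2)[:2000])
```

Calling `main()` performs the network request. Keyless access is limited to 100 requests/day, so it may be worth caching the response locally while experimenting.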

| # | Agent | Description | Score | Tier |
|---|---|---|---|---|
| 1 | GetStream/Vision-Agents | Open Vision Agents by Stream. Build Vision Agents quickly with any model or... | 73 | Verified |
| 2 | video-db/videodb-capture-quickstart | Give your agents real time desktop perception. Stream screen, microphone,... | 41 | Emerging |
| 3 | sijeeshmiziha/visionagent | Multi-provider AI agent framework with vision capabilities and tool calling.... | 36 | Emerging |
| 4 | grctest/g3n-fastapi-webcam-docker | Utilizing multiple Gemma 3n agents to analyze webcam footage | 35 | Emerging |
| 5 | leukaemiamedtech/hias-tassai-facial-recognition | HIAS TassAI Facial Recognition Agent processes streams from local or remote... | 33 | Emerging |
| 6 | TheSethRose/AI-File-Organizer-Agent | Uses an AI agent (powered by Google Gemini via the Agno framework) to... | 31 | Emerging |
| 7 | Karmacoke/chargen | AI-powered character generator built with React. Create detailed TRPG/Novel... | 30 | Emerging |
| 8 | eric-ai-lab/Screen-Point-and-Read | Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with... | 27 | Experimental |
| 9 | mohammad-oghli/Wildlife-Agentic-Vision | Google Gemini Agentic AI Vision for Wildlife Analytics | 24 | Experimental |
| 10 | Arshveen-singh/Vision-CLI | Please contact me at- Arshveensingh@proton.me | 23 | Experimental |
| 11 | jhhfut/eyefriend | Real-time AI vision assistant for visually impaired users. | 22 | Experimental |
| 12 | rupac4530-creator/vision-agent | Production-grade multi-modal AI platform — 17 real-time vision & audio tabs,... | 22 | Experimental |
| 13 | prodev717/vitaura | AI-powered civic issue management system that classifies citizen reports... | 22 | Experimental |
| 14 | Linda5823/Magic-Point-to-Read-V3 | 🪄 Magic Point-to-Read: An interactive AI reading assistant using Google... | 21 | Experimental |
| 15 | Eatosin/Structura | Turn Chaos Into Structure. A Type-Safe AI Agent that extracts valid JSON... | 21 | Experimental |
| 16 | Senju14/focus-bounty-ai | An autonomous AI Agent that uses Computer Vision and LLM reasoning to... | 21 | Experimental |
| 17 | Dewiin/blind-spot | CUNY Tech Prep 2025 Project | 20 | Experimental |
| 18 | nrbnayon/Stream-Lab | Stream Lab lets you watch movies online anytime, anywhere. Create your own... | 20 | Experimental |
| 19 | LatinScribe/siloam-public | Siloam helps visually impaired individuals gain real-time awareness of their... | 20 | Experimental |
| 20 | yusef1975/SortAI | SortAI: is a minimalist desktop automation tool designed for students and... | 19 | Experimental |
| 21 | mikhailusov/askGPT | AskGPT is a real-time AI assistant that enhances your calls, interviews,... | 18 | Experimental |
| 22 | imediacorp/file-organizer | Open-source AI-powered file organizer with scientific rigour. Built by... | 17 | Experimental |
| 23 | KazKozDev/vision-agent-analyst | Vision Agent Analyst is a professional web application for automatic... | 17 | Experimental |
| 24 | neurobot-ai/neurobot-vision | Train and validate computer vision models with integrated tools supporting... | 16 | Experimental |
| 25 | Chihuah/AgentsThinkWrite | Example of using GPT to build custom roles, with divided generation and a combined review workflow. Example of GPT-driven role-based prompt... | 15 | Experimental |
| 26 | biswajit-debnath/IntelliJournal | Modern journal app powered by OpenAI GPT-4, featuring real-time analysis,... | 14 | Experimental |
| 27 | sarveshgupta89/pots_image_extractor | pots_image_extractor | 13 | Experimental |
| 28 | techySPHINX/AetheriaScribe | Unleash your imagination: Next.js streams dynamic tales crafted by Gemini AI... | 13 | Experimental |
| 29 | burgerman/vision_ai_insight | Vision AI-high-security and smart image analysis | 13 | Experimental |
| 30 | mirbasit01/ited | Build your own Generative AI App using Google Gemini API with React JS.... | 13 | Experimental |
| 31 | Mahboob-A/drishti-ai | Eye Disease Detection Using Vision Agents \| https://youtu.be/8LUT89UYnSc | 12 | Experimental |
| 32 | tarsislimadev/web-camera | The web camera that allows you to stream videos | 12 | Experimental |
| 33 | Valerian-AI/Twinship-Repository | Just a pet-project to test Google Gemini 2.5 Pro, Claude Sonnet 4, ChatGPT-5... | 11 | Experimental |
| 34 | burgerman/robotics_skills | Vision AI agent powered by VLM | 11 | Experimental |
| 35 | willytop8/Live-Environment-Streams | A master GeoJSON repository of 1,500+ live outdoor webcams globally.... | 11 | Experimental |
| 36 | GPTBOTS/gptbots-excel-analysis-plugin | Leveraging AI API integration, it delivers functionality such as real-time... | 10 | Experimental |