Fzkuji/GUI-Agent-Harness
🦞 Vision-based desktop automation skills for OpenClaw agents on macOS. See, learn, click — any app.
This tool helps macOS users automate repetitive desktop tasks by letting an AI agent "see" and interact with any application on screen, much as a human would. You provide a high-level instruction; the system processes visual input from your screen and performs clicks, typing, and navigation across apps. It is designed for anyone who needs to automate multi-step workflows spanning several desktop applications without writing scripts or code.
Use this if you need to automate workflows across multiple macOS applications, such as managing emails, updating spreadsheets, or navigating web interfaces, with natural language commands.
Not ideal if your automation needs are limited to a single application with a well-defined API, or if you require cross-platform support beyond macOS.
Stars: 21
Forks: 1
Language: Python
License: MIT
Category:
Last pushed: Apr 04, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/Fzkuji/GUI-Agent-Harness"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
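The same request can be made from Python, the repository's listed language. The following is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page. No key handling is shown, since the page does not say how a key is passed.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside it.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If it is not, json.loads raises
    an error (json.JSONDecodeError for malformed text).
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("Fzkuji", "GUI-Agent-Harness"))
```

Calling `fetch_quality("Fzkuji", "GUI-Agent-Harness")` performs the actual request and counts against the daily limit, so scripts that poll many repositories may want to cache results.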
Higher-rated alternatives
NakaokaRei/SwiftAutoGUI
A Swift library for macOS automation — mouse, keyboard, screenshots, image recognition, and...
dashn9/rustenium
Rust browser automation over the WebDriver BiDi Protocol and CDP
TideSurf/core
Connect Chromium to LLM agents via token-efficient DOM compression. 50-200 tokens per page.
abdo-Mansour/axetract
Low-Cost Cross-Domain Web Structured Information Extraction using specialized LoRA adapters.
BlackRoadOS/search
Sovereign search engine — search.blackroad.io