niuzaisheng/ScreenAgent

ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)

Score: 43 / 100 (Emerging)

This project offers a way to automate complex computer tasks using an AI agent that 'sees' your screen and 'acts' like a human user. It takes your high-level instructions and translates them into mouse clicks and keyboard inputs, allowing the AI to interact with any desktop application or operating system. It's designed for anyone who needs to automate repetitive, multi-step digital workflows that typically require manual interaction with a graphical user interface.
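The "sees the screen, then acts" idea described above can be pictured as a simple observe-decide-act loop. The sketch below is illustrative only and is not ScreenAgent's actual code or API: `Action`, `run_agent`, `fake_capture`, and `scripted_decide` are hypothetical names, and the screen capture and model call are replaced by stand-ins so the loop runs without a desktop or a model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """One low-level GUI step (hypothetical structure for illustration)."""
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(instruction, capture, decide, execute, max_steps=10):
    """Observe-decide-act loop; returns the list of executed actions."""
    history = []
    for _ in range(max_steps):
        screenshot = capture()                            # observe the screen
        action = decide(instruction, screenshot, history) # ask the model what to do
        if action.kind == "done":                         # model reports completion
            break
        execute(action)                                   # send mouse/keyboard input
        history.append(action)
    return history


# Stand-ins so the sketch runs anywhere (assumptions, not real components).
def fake_capture():
    return b""  # a real agent would return screenshot bytes here


def scripted_decide(instruction, screenshot, history):
    # A real agent would query a visual language model; this replays a script.
    script = [
        Action("click", x=120, y=48),
        Action("type", text="hello"),
        Action("done"),
    ]
    return script[len(history)]


executed = []
log = run_agent(
    "Open the editor and type hello",
    fake_capture,
    scripted_decide,
    executed.append,  # a real executor would drive the mouse and keyboard
)
```

In a real setup, `capture` would grab the screen, `decide` would call the model, and `execute` would issue the input events; the `max_steps` cap is a simple guard against a loop that never reports completion.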

579 stars. No commits in the last 6 months.

Use this if you need to automate tasks on a computer's graphical interface that current scripting or RPA tools can't handle, such as those requiring visual understanding or adaptive decision-making.

Not ideal if your tasks are primarily text-based, involve simple data entry, or can be accomplished through existing APIs or backend automations.

desktop-automation workflow-automation human-computer-interaction AI-assisted-operations
Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25

Stars: 579
Forks: 62
Language: Python
License:
Last pushed: Nov 25, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/niuzaisheng/ScreenAgent"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
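The same request can be made from Python using only the standard library. This is a minimal sketch: the URL pattern is taken from the curl command above, while the assumption that the endpoint returns JSON, and the helper names `build_url` and `fetch_quality`, are illustrative and not documented here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_url(repo: str) -> str:
    """Build the endpoint URL for an 'owner/name' repository string."""
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_quality(repo: str, timeout: float = 10.0):
    """Fetch the record for a repo (assumes the response body is JSON)."""
    with urllib.request.urlopen(build_url(repo), timeout=timeout) as resp:
        return json.load(resp)


# Example usage (performs a network request, so it is left commented out):
# data = fetch_quality("niuzaisheng/ScreenAgent")
# print(json.dumps(data, indent=2))
```

Keeping URL construction separate from the network call makes it easy to check the address without spending any of the daily request quota.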