bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
This project provides a multimodal AI agent that helps automate complex tasks across your computer, browser, and other applications. You provide instructions in natural language, and the agent uses its vision and GUI control capabilities to interact with various tools and interfaces, completing the workflow. It's designed for anyone who needs to automate repetitive or multi-step digital processes.
28,739 stars. Actively maintained with 1 commit in the last 30 days.
Use this if you need an AI to autonomously perform tasks that involve interacting with graphical user interfaces, such as booking flights or generating reports by controlling desktop applications and web browsers.
Not ideal if your tasks are purely data-processing without GUI interaction or if you require an agent that operates solely within a terminal environment without visual context.
Stars: 28,739
Forks: 2,814
Language: TypeScript
License: Apache-2.0
Category:
Last pushed: Mar 10, 2026
Commits (30d): 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/bytedance/UI-TARS-desktop"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
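The same request can be made from a script. The sketch below is a minimal illustration in Python using only the standard library. It assumes the endpoint follows the owner/repo pattern shown in the curl example and that the response body is JSON; neither the response schema nor the way an API key is passed is documented on this page, so both are left out. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper).

    Assumes every listing uses the same /<owner>/<repo> pattern as the
    curl example; path segments are percent-encoded defensively.
    """
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data and decode it (hypothetical helper).

    Assumes the body is JSON; the page does not confirm the format.
    Raises urllib.error.HTTPError on a non-2xx status, which is also
    what you would likely see once the daily request limit is hit.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("bytedance", "UI-TARS-desktop")
# print(data)
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the 100 unauthenticated requests per day.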
Related agents
reasonkit/reasonkit
"From Prompt to Cognitive Engineering". — AI: Designed, not Dreamed.
ekingunoncu/izan.io
Turn Any Browser Action & Data Extraction into an AI Tool in 60 Seconds
anilreddyavula/FormPilot
🔧 Automate web form submissions with FormPilot, easily using markdown files while keeping your...
lespaceman/agent-web-interface
A unified perception and interaction interface that enables AI agents to use the web efficiently
ShivamGoyal03/FormPilot
Automation Tool To Fill out Form