bytedance/UI-TARS-desktop

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

Score: 59 / 100 (Established)

This project provides a multimodal AI agent that automates complex tasks across your computer, browser, and other applications. You give instructions in natural language, and the agent uses its vision and GUI control capabilities to operate the relevant tools and interfaces and complete the workflow. It is designed for anyone who needs to automate repetitive or multi-step digital processes.

28,739 stars. Actively maintained with 1 commit in the last 30 days.

Use this if you need an AI to autonomously perform tasks that involve interacting with graphical user interfaces, such as booking flights or generating reports by controlling desktop applications and web browsers.

Not ideal if your tasks are purely data-processing without GUI interaction or if you require an agent that operates solely within a terminal environment without visual context.

Tags: task-automation, digital-assistant, workflow-automation, browser-automation, desktop-automation
No package. No dependents.
Maintenance 13 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25

Stars: 28,739
Forks: 2,814
Language: TypeScript
License: Apache-2.0
Last pushed: Mar 10, 2026
Commits (last 30 days): 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/bytedance/UI-TARS-desktop"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
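
For scripted use, the curl call above can be wrapped in a small helper. The following is a minimal sketch, not an official client: the function names `quality_url` and `fetch_quality` are hypothetical, and it assumes other repositories follow the same `/agents/<owner>/<repo>` path pattern as the example URL. The response format is not described on this page, so the body is printed without parsing.

```shell
#!/usr/bin/env bash
# Base endpoint copied from the curl example above.
API_BASE="https://pt-edge.onrender.com/api/v1/quality/agents"

# quality_url OWNER REPO: print the endpoint URL for one repository.
# Hypothetical helper; assumes the owner/repo path pattern generalizes.
quality_url() {
  printf '%s/%s/%s\n' "$API_BASE" "$1" "$2"
}

# fetch_quality OWNER REPO: request the record and print the raw body.
# --fail makes curl exit non-zero on an HTTP error status (400 or above),
# --silent hides the progress meter, --show-error still reports failures.
fetch_quality() {
  curl --fail --silent --show-error "$(quality_url "$1" "$2")"
}
```

Running `fetch_quality bytedance UI-TARS-desktop` issues the same request as the command above. Because of `--fail`, a script can check the exit status to detect a rejected request, which may be useful given the daily request limit.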