InfiXAI/InfiGUI-G1

[AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic alignment bottlenecks in GUI agents through efficient, guided exploration.

/ 100

Emerging

InfiGUI-G1 improves how AI agents understand and interact with graphical user interfaces (GUIs). It takes natural language instructions and a GUI screenshot, then identifies the correct UI element to interact with, even for complex or subtly described actions. This is for AI researchers and developers building sophisticated GUI automation or AI assistants.
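As a rough illustration of the grounding task described above, the sketch below checks whether a predicted click point lands on the intended UI element. It is not the repository's API: the `UIElement` schema, the `element_at` and `is_grounded` helpers, and the sample screen layout are all hypothetical, and the model's own prediction step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UIElement:
    """A UI element with a pixel bounding box (hypothetical schema)."""
    name: str
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        # Inclusive bounds: a point on the border counts as inside.
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def area(self) -> int:
        return (self.right - self.left) * (self.bottom - self.top)


def element_at(elements, x, y):
    """Return the smallest element containing (x, y), or None if none does."""
    hits = [e for e in elements if e.contains(x, y)]
    if not hits:
        return None
    # Prefer the smallest box so a button wins over the toolbar that holds it.
    return min(hits, key=UIElement.area)


def is_grounded(elements, target_name, predicted_point):
    """True if the predicted click point resolves to the target element."""
    hit = element_at(elements, *predicted_point)
    return hit is not None and hit.name == target_name


# Hypothetical screen layout, standing in for elements found in a screenshot.
screen = [
    UIElement("toolbar", 0, 0, 800, 60),
    UIElement("save_button", 700, 10, 780, 50),
    UIElement("search_box", 100, 10, 400, 50),
]
```

Under these assumptions, a point predicted for an instruction such as "save the file" would count as correct only if `is_grounded(screen, "save_button", point)` is true. Choosing the smallest enclosing box is one simple way to handle nested elements.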

137 stars.

Use this if you are developing AI agents that need to reliably interpret natural language commands and pinpoint the correct elements on any graphical interface, like a website or desktop application.

Not ideal if you are looking for an off-the-shelf end-user application for GUI automation, as this project provides the underlying models and training framework.

AI agent development, GUI automation, human-computer interaction research, multimodal AI, large language models

No Package, No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 13 / 25


Stars

137

Forks

Language

Python

License

Apache-2.0

Higher-rated alternatives

AMA-CMFAI/LAMBDA

This is the official repository of the paper "LAMBDA: A large Model Based Data Agent"....

zjunlp/LLMAgentPapers

Must-read Papers on LLM Agents.

hyp1231/awesome-llm-powered-agent

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

MineDojo/Voyager

An Open-Ended Embodied Agent with Large Language Models

wadeKeith/Awesome-Embodied-AI

An Introduction to Embodied Intelligence (A Quick Guide of Embodied-AI) (Updating)
