WooooDyy/AgentGym

Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.

/ 100

Emerging

AgentGym is a framework for developing and evaluating large language model-based agents across a wide range of tasks and environments. It takes an LLM agent and returns standardized feedback from diverse environments such as web browsing, text games, and digital tasks. Each run produces performance metrics and detailed interaction trajectories, which help researchers understand and improve agent behavior. It is aimed at AI researchers and practitioners building capable, generalist LLM agents.
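The agent-in, feedback-and-trajectory-out pattern described above can be illustrated with a minimal sketch. This is not AgentGym's API: `ToyTextEnv`, `run_episode`, `scripted_agent`, `Step`, and `Trajectory` are hypothetical names invented for illustration, and the scripted agent stands in for an LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One interaction: what the agent saw, what it did, what it earned."""
    observation: str
    action: str
    reward: float


@dataclass
class Trajectory:
    """Ordered record of an episode, used for later analysis."""
    steps: List[Step] = field(default_factory=list)

    @property
    def total_reward(self) -> float:
        return sum(s.reward for s in self.steps)


class ToyTextEnv:
    """Hypothetical text environment: the task is solved by 'open door'."""

    def reset(self) -> str:
        return "You are in a room with a closed door."

    def step(self, action: str) -> Tuple[str, float, bool]:
        # Returns (next observation, reward, episode finished).
        if action == "open door":
            return "The door opens.", 1.0, True
        return "Nothing happens.", 0.0, False


def run_episode(agent: Callable[[str], str], env: ToyTextEnv,
                max_steps: int = 5) -> Trajectory:
    """Run one episode and record every step in a trajectory."""
    trajectory = Trajectory()
    observation = env.reset()
    for _ in range(max_steps):
        action = agent(observation)
        next_observation, reward, done = env.step(action)
        trajectory.steps.append(Step(observation, action, reward))
        observation = next_observation
        if done:
            break
    return trajectory


def scripted_agent(observation: str) -> str:
    # Placeholder policy; a real agent would query an LLM here.
    return "open door" if "door" in observation else "look"


trajectory = run_episode(scripted_agent, ToyTextEnv())
solved = trajectory.total_reward > 0  # simple per-episode success metric
```

Keeping the environment behind a small `reset`/`step` interface is what lets the same loop collect comparable trajectories and metrics across very different tasks.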

742 stars. No commits in the last 6 months.

Use this if you are developing or evaluating large language model agents and need a unified platform with diverse environments and real-time feedback to train and benchmark their capabilities.

Not ideal if you are looking for a pre-built, production-ready AI agent for a specific real-world application, as this is a research and development framework.

AI-agent-development LLM-evaluation reinforcement-learning interactive-AI generalist-AI

Stale 6m No Package No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 21 / 25


Stars 742

Forks 108

Language Python

License MIT


Higher-rated alternatives

ai4co/reevo

[NeurIPS 2024] ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution

SALT-NLP/collaborative-gym

Framework and toolkits for building and evaluating collaborative agents that can work together...

Gen-Verse/LatentMAS

Latent Collaboration in Multi-Agent Systems

lean-dojo/LeanCopilot

LLMs as Copilots for Theorem Proving in Lean

WooooDyy/AgentGym-RL

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon...
