hud-evals/hud-python

OSS RL environment + evals toolkit

65 / 100 (Established)

This platform helps AI researchers and developers build and test advanced AI agents. You can define custom environments in which agents operate, equip them with specific tools such as computer control or bash, and create scenarios to evaluate their performance. Given your agent code and evaluation criteria, it produces performance metrics and training results, which you can then use to train models on those outcomes.

316 stars. Available on PyPI.

Use this if you need to systematically develop, evaluate, and train AI agents within controlled, repeatable environments.

Not ideal if you are looking for a pre-built, ready-to-use AI agent solution without needing to customize environments or run detailed evaluations.

AI agent development, Reinforcement Learning, Agent evaluation, Model training, AI research
Maintenance 10 / 25
Adoption 10 / 25
Maturity 25 / 25
Community 20 / 25


Stars: 316
Forks: 52
Language: Python
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 16

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hud-evals/hud-python"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.