hud-evals/hud-python
OSS RL environment + evals toolkit
This open-source toolkit helps AI researchers and developers build and test advanced AI agents. You define custom environments in which agents operate, equip them with specific tools such as computer control or bash, and write scenarios that evaluate their performance. Given your agent code and evaluation criteria, it produces performance metrics and training results, so you can train models on those outcomes (see the illustrative sketch below).
316 stars. Available on PyPI.
Use this if you need to systematically develop, evaluate, and train AI agents within controlled, repeatable environments.
Not ideal if you are looking for a pre-built, ready-to-use AI agent solution without needing to customize environments or run detailed evaluations.
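To make that workflow concrete, here is a minimal, self-contained toy sketch. Every name in it (Scenario, run_agent, evaluate) is a hypothetical stand-in, not hud-python's actual API; it only illustrates the define-environment, run-scenario, collect-metrics loop described above.

```python
"""Toy model of the workflow described above. Nothing here is hud-python's
real API; all names are hypothetical stand-ins."""
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    prompt: str                    # task given to the agent
    check: Callable[[str], bool]   # evaluation criterion for the agent's output

def run_agent(prompt: str) -> str:
    # Stand-in for a real agent equipped with tools (bash, computer control, ...).
    return "largest log file: 42 MB"

def evaluate(scenarios: list[Scenario]) -> float:
    # Run each scenario, score it against its criterion, and return an
    # aggregate metric that could serve as a reward signal for training.
    passed = sum(s.check(run_agent(s.prompt)) for s in scenarios)
    return passed / len(scenarios)

score = evaluate([Scenario("Report the largest log file's size.", lambda out: "MB" in out)])
print(f"success rate: {score:.0%}")
```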
Stars: 316
Forks: 52
Language: Python
License: MIT
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 16
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hud-evals/hud-python"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
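For programmatic access from Python, a minimal sketch using the requests library. The endpoint URL comes from the curl example above; the shape of the JSON response is an assumption, so inspect it before relying on specific fields.

```python
import requests

# Same endpoint as the curl example above; no key needed up to 100 requests/day.
URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/hud-evals/hud-python"

resp = requests.get(URL, timeout=10)
resp.raise_for_status()
data = resp.json()  # response schema is an assumption: print it to see the actual fields
print(data)
```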
Related tools
hiyouga/EasyR1
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
OpenRL-Lab/openrl
Unified Reinforcement Learning Framework
sail-sg/oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning,...
opendilab/awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
NVlabs/GDPO
Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for...