reward-scope-ai/reward-scope

Real-time reward debugging and hacking detection for reinforcement learning

/ 100

Emerging

When you train a reinforcement learning agent, a rising reward score can make it look as though the agent is improving even when its actual behavior is broken. This tool helps you catch such 'reward hacking' by monitoring training in real time. It takes your ongoing training data as input and provides a live dashboard and alerts that show how the agent is learning and flag problematic exploitation patterns. It is for machine learning researchers, engineers, and practitioners who work with reinforcement learning and need to ensure their agents learn the desired behaviors.

Use this if you are training reinforcement learning agents and need to detect when they are exploiting the reward function in unintended ways, rather than genuinely learning the desired task.

Not ideal if you are working with supervised or unsupervised learning models, as its features are specifically designed for reinforcement learning training analysis.
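
To make the idea of reward hacking detection concrete, here is a minimal sketch of one possible check: flag training when the average reward rises across two consecutive windows of episodes while an independent task-success signal does not. This is not reward-scope's API. The class `RewardHackingMonitor`, its parameters, and the `task_success` signal are all hypothetical names chosen for illustration, and a real detector would likely combine several such heuristics.

```python
from collections import deque
from statistics import mean


class RewardHackingMonitor:
    """Hypothetical monitor for illustration; not part of reward-scope."""

    def __init__(self, window=50, min_reward_gain=0.1, max_success_gain=0.0):
        self.window = window
        self.min_reward_gain = min_reward_gain    # reward must rise at least this much
        self.max_success_gain = max_success_gain  # success must rise at most this much
        # Keep exactly two consecutive windows of history per signal.
        self.rewards = deque(maxlen=2 * window)
        self.successes = deque(maxlen=2 * window)

    def update(self, episode_reward, task_success):
        """Record one episode; return True if the two signals diverge."""
        self.rewards.append(float(episode_reward))
        self.successes.append(float(task_success))
        if len(self.rewards) < 2 * self.window:
            return False  # not enough history to compare two windows yet
        return self._diverging()

    def _diverging(self):
        w = self.window
        rewards = list(self.rewards)
        successes = list(self.successes)
        # Compare the newer window against the older one for each signal.
        reward_gain = mean(rewards[w:]) - mean(rewards[:w])
        success_gain = mean(successes[w:]) - mean(successes[:w])
        return (reward_gain >= self.min_reward_gain
                and success_gain <= self.max_success_gain)


# Synthetic episodes: reward climbs while the task stops being solved.
monitor = RewardHackingMonitor(window=3, min_reward_gain=1.0)
episodes = [(1.0, 1), (1.0, 1), (1.0, 1), (5.0, 0), (6.0, 0), (7.0, 0)]
alerts = [monitor.update(reward, success) for reward, success in episodes]
print(alerts)
```

In this sketch the alert only fires once both windows are full, so the window size trades detection latency against noise. The success signal is assumed to come from a source the agent cannot influence through the reward function, such as a held-out evaluation.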

reinforcement-learning-training agent-behavior-analysis model-debugging machine-learning-operations AI-safety

No Package No Dependents

Maintenance 6 / 25

Adoption 6 / 25

Maturity 13 / 25

Community 9 / 25

Language

Python

License

MIT

Higher-rated alternatives

DLR-RM/stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

google-deepmind/dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning...

Denys88/rl_games

RL implementations

pytorch/rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

yandexdataschool/Practical_RL

A course in reinforcement learning in the wild
