MarcoMeter/recurrent-ppo-truncated-bptt
Baseline implementation of recurrent PPO using truncated BPTT
This tool helps developers and researchers implement and experiment with recurrent Proximal Policy Optimization (PPO) trained with truncated Backpropagation Through Time (BPTT). It consumes observations from simulation environments such as Minigrid or CartPole and produces trained agents that act within them. It is designed for those building reinforcement learning agents that must process sequential information to make decisions.
160 stars. No commits in the last 6 months.
Use this if you are a reinforcement learning practitioner looking for a robust and clear PyTorch baseline to build memory-aware agents for partially observable environments.
Not ideal if you are looking for a plug-and-play solution for a real-world application without prior experience in reinforcement learning or deep learning frameworks.
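The core idea behind truncated BPTT is to cut each collected episode into fixed-length sequences so that gradients flow through a bounded window rather than the whole episode. A minimal sketch of that data-preparation step follows; the function name and padding scheme are illustrative, not this repository's actual API:

```python
def split_into_sequences(episode, seq_len, pad_value=0.0):
    """Chop one episode (a list of per-step values) into equal-length chunks.

    The final chunk is padded with `pad_value` so every sequence has the
    same length, which lets a recurrent model process them as one batch.
    (Hypothetical helper, for illustration only.)
    """
    sequences = []
    for start in range(0, len(episode), seq_len):
        chunk = episode[start:start + seq_len]
        # Pad the last, possibly shorter, chunk up to seq_len.
        chunk = chunk + [pad_value] * (seq_len - len(chunk))
        sequences.append(chunk)
    return sequences


episode = [float(i) for i in range(10)]   # a 10-step episode
seqs = split_into_sequences(episode, seq_len=4)
print(seqs)  # 3 sequences of length 4; the last one is zero-padded
```

In practice the recurrent state (e.g. an LSTM hidden state) recorded at each chunk boundary is also stored, so training can initialize each sequence from where the episode actually was.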
Stars: 160
Forks: 20
Language: Jupyter Notebook
License: MIT
Category:
Last pushed: Apr 28, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/MarcoMeter/recurrent-ppo-truncated-bptt"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
DLR-RM/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
google-deepmind/dm_control
Google DeepMind's software stack for physics-based simulation and Reinforcement Learning...
Denys88/rl_games
RL implementations
pytorch/rl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
yandexdataschool/Practical_RL
A course in reinforcement learning in the wild