RLHF Alignment Training LLM Tools
Tools and implementations for Reinforcement Learning from Human Feedback (RLHF), including reward modeling, policy optimization, and techniques for aligning LLMs with human preferences. Does NOT include general fine-tuning, inference optimization, or non-RLHF alignment methods.
There are 27 RLHF alignment training tools tracked. Five score above 50 (the established tier). The highest-rated is hud-evals/hud-python at 65/100 with 316 stars. One of the top 10 is actively maintained.
Get all 27 projects as JSON:

```bash
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=rlhf-alignment-training&limit=27"
```

The endpoint is open to everyone at 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
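To turn the response into a quick overview on the command line, a jq one-liner like the sketch below can help. The response shape it assumes (a `data` array of objects with `name`, `score`, and `tier` fields) is not documented here, so treat those paths as placeholders and adjust them to the actual payload.

```bash
# Minimal sketch: fetch the dataset and print one line per project.
# The JSON paths (.data[], .name, .score, .tier) are assumptions about the
# response shape, not documented fields; inspect the raw JSON and adjust.
curl -s "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=rlhf-alignment-training&limit=27" \
  | jq -r '.data[] | [.name, .score, .tier] | @tsv'
```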
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | hud-evals/hud-python<br>OSS RL environment + evals toolkit | 65 | Established |
| 2 | hiyouga/EasyR1<br>EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL | | Established |
| 3 | OpenRL-Lab/openrl<br>Unified Reinforcement Learning Framework | | Established |
| 4 | sail-sg/oat<br>🌾 OAT: A research-friendly framework for LLM online alignment, including... | | Established |
| 5 | opendilab/awesome-RLHF<br>A curated list of reinforcement learning with human feedback resources... | | Established |
| 6 | NVlabs/GDPO<br>Official implementation of GDPO: Group reward-Decoupled Normalization Policy... | | Emerging |
| 7 | xrsrke/instructGOOSE<br>Implementation of Reinforcement Learning from Human Feedback (RLHF) | | Emerging |
| 8 | BaohaoLiao/SAGE<br>Self-Hinting Language Models Enhance Reinforcement Learning | | Emerging |
| 9 | haoliuhl/chain-of-hindsight<br>Simple next-token-prediction for RLHF | | Emerging |
| 10 | NJUNLP/GRRM<br>A novel Group Relative Reward Model (GRRM) framework enhances machine... | | Emerging |
| 11 | arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO<br>A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement... | | Emerging |
| 12 | SagnikMukherjee/sparsity_in_rl<br>Reinforcement Learning Finetunes Small Subnetworks in Large Language Models | | Emerging |
| 13 | Jayluci4/micro-rlhf<br>RLHF in ~150 lines - understand how ChatGPT is aligned by building from scratch | | Emerging |
| 14 | WisdomShell/RewardAnything<br>RewardAnything: Generalizable Principle-Following Reward Models | | Emerging |
| 15 | rosinality/meshfn<br>Framework for Human Alignment Learning | | Experimental |
| 16 | Zh1yuShen/MemBuilder<br>Code of "MemBuilder: Reinforcing LLMs for Long-Term Memory Construction via... | | Experimental |
| 17 | zafstojano/policy-gradients<br>A minimal hackable implementation of policy gradient methods (GRPO, PPO, REINFORCE) | | Experimental |
| 18 | hc495/StaICC<br>A standardized toolkit for classification task on In-context Learning.... | | Experimental |
| 19 | ALucek/rl-for-llms<br>Context & Guide For Reinforcement Learning with Verifiable Rewards with... | | Experimental |
| 20 | AlignGPT-VL/AlignGPT<br>Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive... | | Experimental |
| 21 | GAIR-NLP/ReAlign<br>Reformatted Alignment | | Experimental |
| 22 | hggzjx/RewardAuditor<br>Official Repo for Paper: "Reward Auditor: Inference on Reward Modeling... | | Experimental |
| 23 | psunlpgroup/FoVer<br>This repository includes code and materials for the paper "Generalizable... | | Experimental |
| 24 | nielsyA/Tree-GRPO<br>🌳 Enhance LLM agent performance with Tree-GRPO, leveraging tree search... | | Experimental |
| 25 | safouaneelg/SRT2I<br>Class-Conditional self-reward mechanism for improved Text-to-Image models | | Experimental |
| 26 | lafmdp/RLC<br>[ICLR'24] Official code for "Language Model Self-improvement by... | | Experimental |
| 27 | ikun-llm/ikun-GRPO<br>Reinforcement learning alignment \| Group Relative Policy Optimization 🎮 | | Experimental |