RLHF Alignment Training LLM Tools

Tools and implementations for Reinforcement Learning from Human Feedback (RLHF), including reward modeling, policy optimization, and techniques for aligning LLMs with human preferences. Does NOT include general fine-tuning, inference optimization, or non-RLHF alignment methods.

There are 27 RLHF alignment training tools tracked. 5 score above 50 (the established tier). The highest-rated is hud-evals/hud-python at 65/100 with 316 stars. 1 of the top 10 is actively maintained.

Get all 27 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=rlhf-alignment-training&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
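
For scripted access, a minimal Python sketch along the following lines should work. It uses only the standard library and the same query string as the curl command above; the shape of the returned JSON is not documented here, so the script simply pretty-prints the response rather than assuming field names.

```python
# Minimal sketch: fetch the RLHF alignment tools list from the public API.
# The response schema is an assumption (not documented above), so we just
# pretty-print whatever JSON comes back instead of guessing field names.
import json
import urllib.request

URL = (
    "https://pt-edge.onrender.com/api/v1/datasets/quality"
    "?domain=llm-tools&subcategory=rlhf-alignment-training&limit=20"
)

with urllib.request.urlopen(URL, timeout=30) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))
```

Raising the limit parameter should cover all 27 entries; without a key, a handful of calls like this stays well within the 100 requests/day allowance.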

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | hud-evals/hud-python | OSS RL environment + evals toolkit | 65 | Established |
| 2 | hiyouga/EasyR1 | EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL | 58 | Established |
| 3 | OpenRL-Lab/openrl | Unified Reinforcement Learning Framework | 53 | Established |
| 4 | sail-sg/oat | 🌾 OAT: A research-friendly framework for LLM online alignment, including... | 53 | Established |
| 5 | opendilab/awesome-RLHF | A curated list of reinforcement learning with human feedback resources... | 50 | Established |
| 6 | NVlabs/GDPO | Official implementation of GDPO: Group reward-Decoupled Normalization Policy... | 46 | Emerging |
| 7 | xrsrke/instructGOOSE | Implementation of Reinforcement Learning from Human Feedback (RLHF) | 41 | Emerging |
| 8 | BaohaoLiao/SAGE | Self-Hinting Language Models Enhance Reinforcement Learning | 40 | Emerging |
| 9 | haoliuhl/chain-of-hindsight | Simple next-token-prediction for RLHF | 39 | Emerging |
| 10 | NJUNLP/GRRM | A novel Group Relative Reward Model (GRRM) framework enhances machine... | 37 | Emerging |
| 11 | arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement... | 36 | Emerging |
| 12 | SagnikMukherjee/sparsity_in_rl | Reinforcement Learning Finetunes Small Subnetworks in Large Language Models | 33 | Emerging |
| 13 | Jayluci4/micro-rlhf | RLHF in ~150 lines - understand how ChatGPT is aligned by building from scratch | 32 | Emerging |
| 14 | WisdomShell/RewardAnything | RewardAnything: Generalizable Principle-Following Reward Models | 30 | Emerging |
| 15 | rosinality/meshfn | Framework for Human Alignment Learning | 29 | Experimental |
| 16 | Zh1yuShen/MemBuilder | Code of "MemBuilder: Reinforcing LLMs for Long-Term Memory Construction via... | 28 | Experimental |
| 17 | zafstojano/policy-gradients | A minimal hackable implementation of policy gradient methods (GRPO, PPO, REINFORCE) | 28 | Experimental |
| 18 | hc495/StaICC | A standardized toolkit for classification task on In-context Learning... | 27 | Experimental |
| 19 | ALucek/rl-for-llms | Context & Guide For Reinforcement Learning with Verifiable Rewards with... | 27 | Experimental |
| 20 | AlignGPT-VL/AlignGPT | Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive... | 27 | Experimental |
| 21 | GAIR-NLP/ReAlign | Reformatted Alignment | 26 | Experimental |
| 22 | hggzjx/RewardAuditor | Official Repo for Paper: "Reward Auditor: Inference on Reward Modeling... | 26 | Experimental |
| 23 | psunlpgroup/FoVer | This repository includes code and materials for the paper "Generalizable... | 22 | Experimental |
| 24 | nielsyA/Tree-GRPO | 🌳 Enhance LLM agent performance with Tree-GRPO, leveraging tree search... | 22 | Experimental |
| 25 | safouaneelg/SRT2I | Class-Conditional self-reward mechanism for improved Text-to-Image models | 20 | Experimental |
| 26 | lafmdp/RLC | [ICLR'24] Official code for "Language Model Self-improvement by... | 14 | Experimental |
| 27 | ikun-llm/ikun-GRPO | Reinforcement learning alignment \| Group Relative Policy Optimization 🎮 | 14 | Experimental |