Diffusion RLHF Alignment Models

Tools and methods for aligning diffusion models using reinforcement learning and human feedback, including preference optimization, reward modeling, and RLHF fine-tuning techniques. Does NOT include general diffusion model training, inference optimization, or non-RL-based fine-tuning methods like LoRA.

There are 57 diffusion RLHF alignment models tracked. Three score above 50 (the Established tier). The highest-rated is FlorianFuerrutter/genQC at 61/100, with 57 stars.
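The tier labels appear to follow score bands. The sketch below shows one way to express that mapping in Python. Only the "above 50" cutoff for Established is stated on this page; the 30-point boundary between Emerging and Experimental is an assumption inferred from the listing (the lowest Emerging entry scores 30, the highest Experimental entry scores 29). The function name `tier_for_score` is hypothetical.

```python
def tier_for_score(score: int) -> str:
    """Map a 0-100 quality score to a tier label.

    Thresholds are assumptions, not a documented rule:
    - "above 50" for Established is stated in the page text;
    - 30 as the Emerging floor is inferred from the listed scores.
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# Scores taken from the listing: genQC (61), mdlm (49), Diffusion-SDPO (29).
for name, score in [("genQC", 61), ("mdlm", 49), ("Diffusion-SDPO", 29)]:
    print(name, score, tier_for_score(score))
```

A score of exactly 50 does not occur in the listing, so its tier under this sketch (Emerging) follows only from the "above 50" wording.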

Get all 57 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=diffusion&subcategory=diffusion-rlhf-alignment&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
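The same request can be made from a script. The following is a minimal sketch using only the Python standard library. The endpoint and the `domain`, `subcategory`, and `limit` query parameters are copied from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the sketch returns the decoded payload without assuming any field names. Whether a larger `limit` returns all 57 projects is not confirmed by this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str = "diffusion",
              subcategory: str = "diffusion-rlhf-alignment",
              limit: int = 20) -> str:
    """Assemble the query URL; defaults mirror the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url: str, timeout: float = 10.0):
    """Perform the GET request and decode the body as JSON.

    The response structure is unknown here, so the payload is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` issues one request, which counts against the daily quota described above.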

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | FlorianFuerrutter/genQC | Generative Quantum Circuits | 61 | Established |
| 2 | horseee/DeepCache | [CVPR 2024] DeepCache: Accelerating Diffusion Models for Free | 53 | Established |
| 3 | Gen-Verse/MMaDA | MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with... | 51 | Established |
| 4 | kuleshov-group/mdlm | [NeurIPS 2024] Simple and Effective Masked Diffusion Language Model | 49 | Emerging |
| 5 | Shark-NLP/DiffuSeq | [ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models | 47 | Emerging |
| 6 | jeongwhanchoi/SCONE | "SCONE: A Novel Stochastic Sampling to Generate Contrastive Views and Hard... | 43 | Emerging |
| 7 | ali-vilab/TeaCache | Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model | 42 | Emerging |
| 8 | Hzfinfdu/Diffusion-BERT | ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with... | 40 | Emerging |
| 9 | Xiuyu-Li/q-diffusion | [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. | 40 | Emerging |
| 10 | yk7333/d3po | [CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion... | 39 | Emerging |
| 11 | czg1225/AsyncDiff | [NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising | 39 | Emerging |
| 12 | mapo-t2i/mapo | Official codebase for Margin-aware Preference Optimization for Aligning... | 37 | Emerging |
| 13 | MiZhenxing/ThinkDiff | ICML2025, I Think, Therefore I Diffuse: Enabling Multimodal In-Context... | 37 | Emerging |
| 14 | HKUDS/DiffKG | [WSDM'2024 Oral] "DiffKG: Knowledge Graph Diffusion Model for Recommendation" | 37 | Emerging |
| 15 | HKUDS/RecDiff | [CIKM'2024] "RecDiff: Diffusion Model for Social Recommendation" | 36 | Emerging |
| 16 | yjyddq/EOSER-ASS-RL | Official Repository of "Taming Masked Diffusion Language Models via... | 36 | Emerging |
| 17 | mihirp1998/AlignProp | AlignProp uses direct reward backpropagation for the alignment of... | 35 | Emerging |
| 18 | InternScience/AdaptiveDiffusion | [NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference... | 35 | Emerging |
| 19 | basiclab/DiffusionDRO | [NeurIPS 2025] Ranking-based Preference Optimization for Diffusion Models... | 34 | Emerging |
| 20 | Ting-Justin-Jiang/sada-icml | [ICML 2025] Official Repo for Stability-guided Adaptive Diffusion... | 34 | Emerging |
| 21 | keshik6/grafting | [NeurIPS 2025 Oral] Official Code for Exploring Diffusion Transformer... | 34 | Emerging |
| 22 | aaron-di/CDM-PSL | [AAAI 2025] CDM-PSL: Expensive Multi-Objective Bayesian Optimization Based... | 33 | Emerging |
| 23 | H-EmbodVis/EasyCache | Less is Enough: Training-Free Video Diffusion Acceleration via... | 33 | Emerging |
| 24 | hu-zijing/B2-DiffuRL | [CVPR 25] A framework named B^2-DiffuRL for RL-based diffusion model fine-tuning. | 33 | Emerging |
| 25 | masa-ue/SVDD | Derivative-Free Guidance in Diffusion Models with Soft Value-Based Decoding.... | 33 | Emerging |
| 26 | AniAggarwal/ecad | [ICLR 2026] Code for Evolutionary Caching to Accelerate Your Off-the-Shelf... | 33 | Emerging |
| 27 | ZiyiZhang27/sdpo | [IEEE TPAMI] Code for the paper "Aligning Few-Step Diffusion Models with... | 32 | Emerging |
| 28 | BIT-DA/DUSA | [NeurIPS 2024] Exploring Structured Semantic Priors Underlying Diffusion... | 30 | Emerging |
| 29 | ilog-ecnu/CDM-PSL | [AAAI 2025] CDM-PSL: Expensive Multi-Objective Bayesian Optimization Based... | 30 | Emerging |
| 30 | AIDC-AI/Diffusion-SDPO | Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models | 29 | Experimental |
| 31 | L-YeZhu/BoundaryDiffusion | [NeurIPS2023] BoundaryDiffusion: A learning-free method for semantic control... | 29 | Experimental |
| 32 | hu-zijing/D-Fusion | [ICML 25] Denoising trajectory fusion, a method to construct RL-trainable... | 29 | Experimental |
| 33 | user683/CausalDiffRec | [WWW'25] The official implementation of Graph Representation Learning via... | 29 | Experimental |
| 34 | ModelTC/HarmoniCa | [ICML 2025] This is the official PyTorch implementation of "🎵 HarmoniCa:... | 28 | Experimental |
| 35 | masa-ue/RLfinetuning_Diffusion_Bioseq | Code for the tutorial/review paper for RL-based fine-tuning. In this code,... | 28 | Experimental |
| 36 | OptiSys-ZJU/segquant | [CVPR '26] A Semantics-Aware and Generalizable Quantization Framework for... | 28 | Experimental |
| 37 | akashsonowal/ddpo-pytorch | RLHF for Stable Diffusion | 27 | Experimental |
| 38 | zihaowu25/InvarDiff | InvarDiff: Cross-Scale Invariance Caching for Accelerated Diffusion Models | 26 | Experimental |
| 39 | ZiyiZhang27/tdpo | [ICML 2024] Code for the paper "Confronting Reward Overoptimization for... | 26 | Experimental |
| 40 | yjyddq/rho-EOS | Official Repository of "ρ-𝙴𝙾𝚂: Training-free Bidirectional Variable-Length... | 25 | Experimental |
| 41 | itsluckysharma01/RL-based_Adaptive_Game_Difficulty_Engine | This repository contains an implementation of a 🏗️ Reinforcement Learning... | 25 | Experimental |
| 42 | HKUDS/DiffMM | [ACM MM'2024] "DiffMM: Multi-Modal Diffusion Model for Recommendation" | 25 | Experimental |
| 43 | horseee/learning-to-cache | [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via... | 23 | Experimental |
| 44 | UCF-CRCV/core | Context-Robust Remasking for Diffusion Language Models | 23 | Experimental |
| 45 | zhaoyl18/SEIKO | SEIKO is a novel reinforcement learning method to efficiently fine-tune... | 23 | Experimental |
| 46 | suinleelab/An-Efficient-Framework-for-Crediting-Data-Contributors-of-Diffusion-Models | [ICLR2025] An Efficient Framework for Crediting Data Contributors of Diffusion Models | 23 | Experimental |
| 47 | sahsaeedi/DCPO-T2I | [TMLR] Dual Caption Preference Optimization | 21 | Experimental |
| 48 | LemonTwoL/ReNeg | ReNeg: Learning Negative Embedding with Reward Guidance | 21 | Experimental |
| 49 | Yeez-lee/Data-Selection-and-Reweighting-for-Diffusion-Models | [ICASSP 25'] Pruning then Reweighting: Towards Data-Efficient Training of... | 20 | Experimental |
| 50 | THU-AccDiff/xslim | Official implementation of X-Slim (xslim): Accelerating diffusion model via... | 19 | Experimental |
| 51 | federicobrancasi/quantdiff-paper | Research paper: QuantDiff - Efficient Mixed-Precision Quantization for... | 19 | Experimental |
| 52 | arthur-x/AlmostPerfect | Simple end-to-end RLHF (Reinforcement Learning from Human Feedback) for... | 17 | Experimental |
| 53 | jinluo12345/Reinforcement-learning-guidance | RLG: Inference-Time Alignment Control for Diffusion Models with... | 14 | Experimental |
| 54 | IDavron/RWR | Implementation of Reward Weighted Regression method proposed in "Training... | 11 | Experimental |
| 55 | LIUTIGHE/HetCache | [CVPR'26] The official implementation of paper "Accelerating Diffusion-based... | 11 | Experimental |
| 56 | KJLdefeated/MODDPO | 2024 NYCU DLP Final Project: Training Diffusion Model with Multi-Objective... | 10 | Experimental |
| 57 | Alsace08/RetroDiff | [AISTATS 2025] Code and Data Repo for Paper "RetroDiff: Retrosynthesis as... | 10 | Experimental |