li-plus/flash-preference
Accelerate LLM preference tuning via prefix sharing with a single line of code
This tool accelerates fine-tuning of large language models (LLMs) on human-preference data. By sharing the common prompt prefix across paired responses, it speeds up both the forward and backward passes during training, making it well suited to techniques such as Direct Preference Optimization (DPO) and reward modeling.
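To see why prefix sharing helps, consider that every preference example pairs one prompt with a chosen and a rejected response. A toy sketch (not the library's actual API) of the token-count savings:

```python
# Toy illustration of prefix sharing in preference tuning.
# Each example has a shared prompt plus a chosen and a rejected response.
prompt = ["The", "capital", "of", "France", "is"]
chosen = ["Paris", "."]
rejected = ["London", "."]

# Naive approach: the model processes two full sequences,
# running over the identical prompt tokens twice.
naive_tokens = len(prompt + chosen) + len(prompt + rejected)

# Prefix sharing: process the common prompt once,
# then only the two distinct response suffixes.
shared_tokens = len(prompt) + len(chosen) + len(rejected)

print(naive_tokens, shared_tokens)  # 14 vs 9 tokens processed
```

The longer the shared prompt relative to the responses (common in DPO datasets with long instructions or few-shot context), the larger the savings in compute and activation memory.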
No commits in the last 6 months.
Use this if you are an ML engineer training LLMs with preference data and want to significantly reduce computation time and memory usage without compromising model accuracy.
Not ideal if you are not directly involved in training or fine-tuning large language models, or if your tasks do not involve preference-based learning.
Stars
51
Forks
—
Language
Python
License
MIT
Category
Last pushed
Jul 04, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/li-plus/flash-preference"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
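The same endpoint can be queried programmatically. A minimal Python sketch using only the standard library (the response schema is not documented here, so the payload is just pretty-printed as-is):

```python
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(ecosystem: str, repo: str) -> str:
    # Build the endpoint URL for a repo's quality data,
    # e.g. ecosystem "transformers", repo "li-plus/flash-preference".
    return f"{API_BASE}/{ecosystem}/{repo}"

url = quality_url("transformers", "li-plus/flash-preference")
print(url)

# Uncomment to fetch and pretty-print the live JSON payload:
# with urlopen(url) as resp:
#     print(json.dumps(json.load(resp), indent=2))
```

An API key (if you have one) would typically be passed as a header or query parameter; the free tier above requires none.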
Higher-rated alternatives
stair-lab/mlhp
Machine Learning from Human Preferences
princeton-nlp/SimPO
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
uclaml/SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
general-preference/general-preference-model
[ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment...
sail-sg/dice
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards