princeton-nlp/SimPO

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

Quality score: 43 / 100 (Emerging)

This project helps large language model (LLM) developers fine-tune their models to better align with human preferences. It takes a base LLM and a dataset of preferred and dispreferred responses, then outputs a refined LLM that generates more helpful and higher-quality text. Data scientists and machine learning engineers responsible for deploying and improving conversational AI or text generation systems will find this useful.
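The method the repository implements, SimPO, scores each response with a length-normalized average log-probability under the policy model itself (no reference model) and applies a target reward margin. A minimal sketch of the per-pair loss is below; the hyperparameter values are illustrative, not the paper's tuned defaults:

```python
import math

def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """Reference-free SimPO loss for one preference pair.

    chosen_logps / rejected_logps: per-token log-probabilities of the
    preferred and dispreferred responses under the policy model.
    beta scales the length-normalized reward; gamma is the target margin.
    """
    # Length-normalized average log-probability is the implicit reward,
    # so longer responses are not favored simply for having more tokens.
    r_chosen = beta * sum(chosen_logps) / len(chosen_logps)
    r_rejected = beta * sum(rejected_logps) / len(rejected_logps)
    # Bradley-Terry-style objective with a margin; no reference model term.
    margin = r_chosen - r_rejected - gamma
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

In training, this loss is averaged over a batch of preference pairs and minimized; a confidently preferred chosen response drives the loss toward zero, while ties still incur a penalty because of the margin.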

946 stars. No commits in the last 6 months.

Use this if you need to optimize a large language model to produce outputs that consistently match human preferences for quality and helpfulness, especially when a simpler, more efficient approach is desired.

Not ideal if you are looking for a pre-trained, ready-to-use LLM for general tasks without custom fine-tuning or if you lack the technical expertise to work with model training frameworks.

large-language-models conversational-ai-development model-fine-tuning natural-language-generation machine-learning-engineering
Flags: Stale (6 months) · No package · No dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 17 / 25


Stars: 946
Forks: 73
Language: Python
License: MIT
Last pushed: Feb 16, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/princeton-nlp/SimPO"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
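The same endpoint can be called from Python with only the standard library. The URL layout (registry/owner/repo) is taken from the curl example above; the shape of the JSON payload is not documented here, so the fetch helper simply returns the decoded response:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(registry: str, owner: str, repo: str) -> str:
    # Build the endpoint URL shown in the curl example (registry/owner/repo).
    return f"{API_BASE}/{registry}/{owner}/{repo}"

def fetch_quality(registry: str, owner: str, repo: str, timeout: float = 10.0):
    # Fetch and decode the JSON payload; field names depend on the API.
    with urllib.request.urlopen(quality_url(registry, owner, repo),
                                timeout=timeout) as resp:
        return json.load(resp)

print(quality_url("transformers", "princeton-nlp", "SimPO"))
```

No API key is needed for the free tier, so the request carries no authentication header.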