ducnh279/Align-LLMs-with-DPO

Align a Large Language Model (LLM) with DPO loss

Score: 20 / 100 (Experimental)

This project helps machine learning engineers and researchers fine-tune Large Language Models (LLMs) to better align with human preferences. You provide an LLM and a dataset of preferred and dispreferred responses, and it outputs a fine-tuned model that generates more desirable text. It is aimed at professionals working on improving the behavior and safety of AI models.

No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher looking to apply Direct Preference Optimization (DPO) to align your LLM using a custom dataset of human preferences.

Not ideal if you are an end-user without a technical background in machine learning and model training, as this is a developer-focused tool for LLM alignment.
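The technique the project is built around, Direct Preference Optimization, can be sketched as a small scalar computation. The function below is an illustrative sketch of the standard DPO loss for a single preference pair, not code taken from this repository; the argument names and `beta` default are assumptions.

```python
import math

def dpo_loss(policy_chosen_lp, policy_rejected_lp,
             ref_chosen_lp, ref_rejected_lp, beta=0.1):
    """DPO loss for one (preferred, dispreferred) response pair.

    Each argument is the summed log-probability of a response under
    either the trainable policy model or the frozen reference model.
    beta controls how far the policy may drift from the reference.
    """
    # Margin by which the policy prefers the chosen response,
    # measured relative to the reference model's own margin.
    logits = (policy_chosen_lp - policy_rejected_lp) - (ref_chosen_lp - ref_rejected_lp)
    # -log(sigmoid(beta * logits)), written via log1p for numerical stability
    return math.log1p(math.exp(-beta * logits))
```

When the policy matches the reference the relative margin is zero and the loss is log 2 ≈ 0.693; widening the policy's preference for the chosen response drives the loss toward zero.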

Topics: LLM-fine-tuning · AI-model-alignment · natural-language-processing · deep-learning-research
Badges: No License · Stale (6m) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 4 / 25
Maturity: 8 / 25
Community: 8 / 25

How are scores calculated?
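The category values shown for this project add up to the overall score. The breakdown is taken from this page; that the overall score is simply the sum of four 25-point categories is an assumption, verified here against the displayed numbers:

```python
# Assumed scoring rule: overall score = sum of four 25-point categories.
# Category values are those shown on this page for the project.
scores = {"Maintenance": 0, "Adoption": 4, "Maturity": 8, "Community": 8}
total = sum(scores.values())
print(total)  # 20, matching the 20 / 100 shown above
```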

Stars: 8
Forks: 1
Language: Jupyter Notebook
License: none
Last pushed: Jun 06, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/ducnh279/Align-LLMs-with-DPO"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.