uclaml/SPIN

The official implementation of Self-Play Fine-Tuning (SPIN)

Emerging

This project helps machine learning engineers and researchers improve large language models (LLMs) without needing extensive new human-annotated data. You provide an existing LLM and a dataset of real user prompts and their desired responses, and it outputs a more capable LLM. This is ideal for those who want to enhance an LLM's performance beyond its initial supervised fine-tuning.

1,235 stars. No commits in the last 6 months.

Use this if you are a machine learning engineer or researcher who wants to improve an existing, already fine-tuned large language model by iteratively training it on its own generated responses, reducing reliance on costly human preference labeling.
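
To make the self-play idea concrete, here is a minimal sketch of the pairwise logistic objective as we understand it from the SPIN method: the model is rewarded for assigning relatively more probability to the human-written response than to a response generated by its previous iteration. This is not the repository's code. The function name, argument names, and the `beta` default are hypothetical, and the inputs are assumed to be summed sequence log-probabilities.

```python
import math

def spin_pair_loss(logp_real, logp_real_ref, logp_gen, logp_gen_ref, beta=0.1):
    """Logistic loss for one (human response, self-generated response) pair.

    logp_real / logp_gen: log-probabilities under the model being trained.
    logp_real_ref / logp_gen_ref: log-probabilities under the previous
    iteration, which also produced the self-generated response.
    beta is an assumed scaling factor, not a value taken from the repository.
    """
    # How much more the current model prefers the human response,
    # measured relative to the previous iteration.
    margin = beta * ((logp_real - logp_real_ref) - (logp_gen - logp_gen_ref))
    # log(1 + exp(-margin)), written in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# At the start of an iteration the model equals its previous self,
# so the margin is zero and the loss is log(2).
print(spin_pair_loss(-5.0, -5.0, -5.0, -5.0))
```

Once the current model raises the likelihood of the human response and lowers that of its own earlier generation, the loss falls below log(2). Each iteration then regenerates responses with the updated model and repeats.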

Not ideal if you need to train a large language model from scratch, or if you prefer traditional methods requiring large amounts of human-labeled preference data for alignment.

large-language-models LLM-fine-tuning AI-model-training natural-language-processing model-optimization

Stale (6 months), no package, no dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 18 / 25

Stars: 1,235

Forks: 104

Language: Python

License: Apache-2.0

Higher-rated alternatives

agentscope-ai/Trinity-RFT

Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...

OpenRLHF/OpenRLHF

An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...

zjunlp/EasyEdit

[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.

huggingface/alignment-handbook

Robust recipes to align language models with human and AI preferences

hyunwoongko/nanoRLHF

nanoRLHF: from-scratch journey into how LLMs and RLHF really work.
