InternLM/Spark
An official implementation of "SPARK: Synergistic Policy And Reward Co-Evolving Framework"
SPARK helps AI researchers and developers fine-tune large language models more efficiently. Given an existing language model and verifiable data, it jointly trains the policy (how the model behaves) and the reward signal (how well it performs) within a single model, yielding a self-learning, self-evolving system. The output is a more capable, specialized language model that performs better across a range of tasks.
Use this if you are a researcher or AI engineer looking to enhance large language models by integrating policy and reward mechanisms within a single model for joint training, without needing human preference data or external reward models.
Not ideal if you are a general user looking for an off-the-shelf AI tool for everyday tasks, as this is a framework for advanced model training and development.
Stars: 25
Forks: —
Language: Python
License: Apache-2.0
Category: —
Last pushed: Oct 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/InternLM/Spark"
Open to everyone: 100 requests/day with no key; a free key raises the limit to 1,000/day.
Higher-rated alternatives
agentscope-ai/Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...
zjunlp/EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
hyunwoongko/nanoRLHF
nanoRLHF: from-scratch journey into how LLMs and RLHF really work.