spiral-rl/spiral

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

/ 100

Emerging

SPIRAL helps AI researchers and developers create more intelligent language models without needing extensive human-curated data or complex reward engineering. It trains models by having them play multi-turn, zero-sum games against themselves, generating an endless supply of progressively challenging problems. The output is a language model that has developed advanced reasoning strategies and performs better on various math and general reasoning benchmarks.
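
The self-play idea described above can be illustrated with a toy sketch. This is not SPIRAL's code: the game (a simple Nim variant), the `random_policy` stand-in for the language model, and every function name here are assumptions made purely for illustration. It shows only the core loop: one shared policy plays both seats of a multi-turn zero-sum game, and the final outcome is turned into rewards that sum to zero.

```python
import random


def legal_moves(stones):
    # Toy Nim variant (an assumption): a player may take 1, 2, or 3 stones.
    return [n for n in (1, 2, 3) if n <= stones]


def random_policy(stones, rng):
    # Hypothetical stand-in for the language model being trained.
    # The same policy is used for both players, which is what makes it self-play.
    return rng.choice(legal_moves(stones))


def self_play_episode(policy, start_stones=10, seed=0):
    """Play one game of the policy against itself.

    Returns the list of turns and a zero-sum reward for each player.
    """
    rng = random.Random(seed)
    stones, player = start_stones, 0
    trajectory = []  # entries are (player, stones_before_move, move)
    while stones > 0:
        move = policy(stones, rng)
        trajectory.append((player, stones, move))
        stones -= move
        player = 1 - player
    # The player who took the last stone wins; `player` was flipped after that move.
    winner = 1 - player
    rewards = {winner: 1.0, 1 - winner: -1.0}
    return trajectory, rewards


trajectory, rewards = self_play_episode(random_policy)
# Credit each turn with the final outcome of the player who made it.
training_examples = [(state, move, rewards[p]) for p, state, move in trajectory]
```

In a real setup the random policy would be replaced by a model whose parameters are updated from examples like `training_examples`, so the opponent gets stronger as training progresses.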

177 stars. No commits in the last 6 months.

Use this if you are a machine learning researcher or engineer looking to train powerful reasoning capabilities into large language models through autonomous self-play in competitive text-based games.

Not ideal if you need a pre-trained model for immediate deployment or if you are not comfortable with advanced reinforcement learning and distributed training setups.

AI model training, Reinforcement learning, Natural language processing, Generative AI, Game AI

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 15 / 25

Community 15 / 25

Stars

177

Forks

Language

Python

License

MIT

Higher-rated alternatives

ai4co/reevo

[NeurIPS 2024] ReEvo: Large Language Models as Hyper-Heuristics with Reflective Evolution

SALT-NLP/collaborative-gym

Framework and toolkits for building and evaluating collaborative agents that can work together...

Gen-Verse/LatentMAS

Latent Collaboration in Multi-Agent Systems

lean-dojo/LeanCopilot

LLMs as Copilots for Theorem Proving in Lean

WooooDyy/AgentGym-RL

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon...
