GithubX-F/DynaMO-RL
Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization (DynaMO) - Official Implementation
DynaMO helps large language model (LLM) developers fine-tune models for complex reasoning tasks, especially in mathematics. Given an LLM and a set of reasoning problems, it dynamically optimizes the training process, producing a more accurate and robust model that solves mathematical problems with higher success rates.
Use this if you develop or train LLMs for tasks requiring verifiable reasoning, particularly in mathematical domains, and want to improve their performance and training efficiency.
Not ideal if you work with LLMs for creative writing, summarization, or other non-reasoning tasks, or if you are not directly involved in LLM training and optimization.
Stars: 86
Forks: 2
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/GithubX-F/DynaMO-RL"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
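The curl example above can be wrapped in a small Python helper. A minimal sketch, assuming the endpoint returns JSON; the response's field names are not documented on this page, so inspect the decoded dict before relying on any keys:

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub repo,
    mirroring the curl example shown above."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for a repo.
    The response schema is an assumption: the page does not
    document field names, so check the keys you get back."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Prints the same URL the curl example requests.
    print(quality_url("GithubX-F", "DynaMO-RL"))
```

Within the free tier (100 requests/day without a key), no authentication header is needed; how a key is passed for the 1,000/day tier is not documented here.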
Higher-rated alternatives
agentscope-ai/Trinity-RFT: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...
OpenRLHF/OpenRLHF: An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...
zjunlp/EasyEdit: [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
huggingface/alignment-handbook: Robust recipes to align language models with human and AI preferences
hyunwoongko/nanoRLHF: nanoRLHF: from-scratch journey into how LLMs and RLHF really work.