GithubX-F/DynaMO-RL
Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization (DynaMO) - Official Implementation
DynaMO helps large language model (LLM) developers fine-tune models for complex reasoning tasks, especially in mathematics. Given an LLM and a set of reasoning problems, it dynamically optimizes the training process, producing a more accurate and robust model that solves mathematical problems with higher success rates.
Use this if you develop or train LLMs for tasks requiring verifiable reasoning, particularly in mathematical domains, and want to improve their performance and training efficiency.
Not ideal if you work with LLMs for creative writing, summarization, or other non-reasoning tasks, or if you are not directly involved in LLM training and optimization.
Stars: 86
Forks: 2
Language: Python
License: Apache-2.0
Category:
Last pushed: Mar 10, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/GithubX-F/DynaMO-RL"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
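The curl example above can be wrapped in a small Python helper. A minimal sketch, assuming the endpoint returns JSON; the response's field names are not documented on this page, so inspect the decoded dict before relying on any keys:

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/transformers"

def quality_url(owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a GitHub repo,
    mirroring the curl example shown above."""
    return f"{BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON payload for a repo.
    The response schema is an assumption: the page does not
    document field names, so check the keys you get back."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Prints the same URL the curl example requests.
    print(quality_url("GithubX-F", "DynaMO-RL"))
```

Within the free tier (100 requests/day without a key), no authentication header is needed; how a key is passed for the 1,000/day tier is not documented here.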
Higher-rated alternatives
agentscope-ai/Trinity-RFT: Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...
OpenRLHF/OpenRLHF: An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...
zjunlp/EasyEdit: [ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
huggingface/alignment-handbook: Robust recipes to align language models with human and AI preferences
hyunwoongko/nanoRLHF: nanoRLHF: from-scratch journey into how LLMs and RLHF really work.