thinkwee/NOVER
[EMNLP-2025] R1-Zero on ANY TASK
This project helps AI engineers and researchers improve how language models reason and generate text, particularly on complex tasks beyond math or coding. You provide a standard dataset of prompts and expected answers, and the framework trains your language model to produce better, more logical responses without needing a separate verifier or reward model to check outputs. The result is a more capable language model that handles a wider range of reasoning-intensive text-to-text tasks.
Use this if you need to train or fine-tune a language model for stronger reasoning across diverse text-based tasks, using only your existing supervised fine-tuning data.
Not ideal if you are looking for a pre-trained model to use directly rather than a framework for training your own custom language models.
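The description notes that training needs only a standard dataset of prompts and expected answers. A minimal sketch of what such a dataset might look like on disk, assuming a JSON Lines layout with hypothetical "prompt"/"answer" field names (an illustrative assumption, not NOVER's documented schema):

```python
import json

# Hypothetical SFT-style records: each pairs a prompt with a reference
# answer. The field names "prompt" and "answer" are assumptions for
# illustration, not NOVER's documented schema.
records = [
    {"prompt": "Is 17 a prime number? Explain.",
     "answer": "Yes. 17 has no divisors other than 1 and itself."},
    {"prompt": "Which is heavier, a kilogram of iron or a kilogram of feathers?",
     "answer": "Neither; both weigh one kilogram."},
]

# Serialize to JSON Lines, a common format for fine-tuning datasets:
# one JSON object per line.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Round-trip check: reload and confirm the structure survived.
with open("train.jsonl", encoding="utf-8") as f:
    reloaded = [json.loads(line) for line in f]
print(len(reloaded))  # → 2
```

Because no separate verifier is required, the reference answers alone are enough to drive training; there is no extra reward-model or rule-checker artifact to prepare.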
Stars
28
Forks
1
Language
Python
License
—
Category
—
Last pushed
Nov 09, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/thinkwee/NOVER"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
agentscope-ai/Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement...
OpenRLHF/OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO &...
zjunlp/EasyEdit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
huggingface/alignment-handbook
Robust recipes to align language models with human and AI preferences
hyunwoongko/nanoRLHF
nanoRLHF: from-scratch journey into how LLMs and RLHF really work.