anakin87/qwen-scheduler-grpo
Train a Language Model with GRPO to create a schedule from a list of events and priorities
This project explores teaching a language model to create schedules. You provide a list of events with start and end times and mark which events have higher priority. The model then generates a schedule that favors the priority events and aims to maximize the total duration of the selected events. It is aimed at researchers and developers experimenting with reinforcement learning for large language models.
264 stars. No commits in the last 6 months.
Use this if you are a researcher or developer interested in training LLMs with reinforcement learning, where the model learns from reward signals rather than from explicit example solutions.
Not ideal if you need a production-ready scheduling tool that reliably avoids all event overlaps, as the current model still struggles with this specific constraint.
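To make the task concrete, here is a minimal sketch of how a candidate schedule could be checked and scored. It is not the repository's reward function: the `(name, start, end)` tuple format, the function names, and the `priority_weight` of 2 are all assumptions chosen for illustration. The only ideas taken from the description above are that selected events should not overlap, that priority events count for more, and that longer total duration is better.

```python
from datetime import datetime


def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute


def has_overlap(schedule) -> bool:
    """Return True if any two events in the schedule overlap.

    `schedule` is a list of (name, start, end) tuples with 'HH:MM' times
    (a hypothetical format, not the project's actual one).
    Back-to-back events (one ends exactly when the next starts) are allowed.
    """
    spans = sorted((to_minutes(s), to_minutes(e)) for _, s, e in schedule)
    # After sorting by start time, checking neighbours is enough.
    return any(
        prev_end > next_start
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:])
    )


def score(schedule, priorities, priority_weight: int = 2) -> int:
    """Weighted total duration in minutes, or 0 if the schedule overlaps.

    `priority_weight` is an illustrative assumption.
    """
    if has_overlap(schedule):
        return 0
    total = 0
    for name, start, end in schedule:
        duration = to_minutes(end) - to_minutes(start)
        total += duration * (priority_weight if name in priorities else 1)
    return total
```

For example, `score([("Standup", "09:00", "09:30"), ("Lunch", "12:00", "13:00")], {"Standup"})` counts the 30-minute priority event twice and the 60-minute lunch once. A schedule containing overlapping events scores 0 under this sketch, which mirrors the constraint the model is reported to struggle with.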
Stars: 264
Forks: 16
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Apr 29, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/anakin87/qwen-scheduler-grpo"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
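The same request can be made from Python using only the standard library. This is a rough sketch: the base URL is copied from the curl command above, while the helper names `build_url` and `fetch_quality` are hypothetical. The response format is not documented here, so the sketch returns the raw body text instead of assuming a schema.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; treat it as an assumption
# that other repositories follow the same owner/repo pattern.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for a given GitHub owner and repository."""
    return f"{BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text.

    No key is sent, so the unauthenticated daily limit applies.
    """
    with urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("anakin87", "qwen-scheduler-grpo")` issues the same GET request as the curl command.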
Higher-rated alternatives
axolotl-ai-cloud/axolotl
Go ahead and axolotl questions
google/paxml
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for...
JosefAlbers/PVM
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
iamarunbrahma/finetuned-qlora-falcon7b-medical
Finetuning of Falcon-7B LLM using QLoRA on Mental Health Conversational Dataset
h2oai/h2o-wizardlm
Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning