sail-sg/oat

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Quality score: 53 / 100 (Established)

This framework helps AI researchers and practitioners rapidly experiment with and develop new algorithms for the online alignment of large language models (LLMs) to human preferences or specific behaviors. It takes LLM responses and feedback (such as preferences or verifiable rewards) and produces a refined, better-performing LLM. It is designed for those working on improving how LLMs interact and respond in real time.


Use this if you are an AI researcher or machine learning engineer focused on developing or evaluating online alignment algorithms for LLMs.

Not ideal if you are a developer looking for a simple, out-of-the-box solution to fine-tune an LLM without deep involvement in algorithm research.

Tags: LLM-research, online-learning, reinforcement-learning, preference-learning, AI-alignment
No package · No dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 17 / 25


Stars: 638
Forks: 60
Language: Python
License: Apache-2.0
Last pushed: Jan 29, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/sail-sg/oat"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
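The same endpoint can be queried from Python instead of curl; a minimal standard-library sketch, assuming only the URL shown above (the helper names and the JSON field layout are illustrative, since the response schema is not documented here):

```python
import json
import urllib.request

# Base path taken from the curl example above
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the quality-report URL, e.g. category='llm-tools', repo='sail-sg/oat'."""
    return f"{API_BASE}/{category}/{repo}"


def fetch_quality(category: str, repo: str) -> dict:
    """Fetch and decode the JSON quality report (no key needed: 100 requests/day)."""
    with urllib.request.urlopen(quality_url(category, repo)) as resp:
        return json.load(resp)


# Usage (performs a network request):
#   report = fetch_quality("llm-tools", "sail-sg/oat")
#   print(json.dumps(report, indent=2))
```

For heavier use, the same request can be sent with an API key (1,000 requests/day); how the key is passed (header or query parameter) is not specified on this page.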