OSU-NLP-Group/cobalt

Code and data for the paper "Bridging Online and Offline RL: Contextual Bandit Learning for Multi-Turn Code Generation"

26 / 100 (Experimental)

Cobalt helps large language models (LLMs) generate more accurate and functional code over multiple steps. It takes existing code generation attempts and uses them to train the LLM to better complete coding tasks one step at a time. This tool is designed for AI researchers and practitioners working on improving LLMs for complex, iterative coding challenges.

Use this if you are an AI researcher or practitioner looking to enhance large language models' ability to generate correct code through multi-turn interactions, especially when balancing training cost and performance.

Not ideal if you are looking for a ready-to-use code generation application rather than a method for training and evaluating underlying LLMs.

AI research, code generation, reinforcement learning, large language models, machine learning engineering
No Package, No Dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 11 / 25
Community 0 / 25
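
The four sub-scores listed above add up to the headline figure of 26 / 100. The short check below assumes the overall score is the plain sum of the four categories, each capped at 25; that is consistent with the numbers shown here, but the page does not state the scoring formula, so treat it as an assumption.

```python
# Sub-scores as listed on this page; each category is scored out of 25.
sub_scores = {
    "Maintenance": 10,
    "Adoption": 5,
    "Maturity": 11,
    "Community": 0,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is an unweighted sum of the categories.
total = sum(sub_scores.values())
max_total = MAX_PER_CATEGORY * len(sub_scores)

print(f"{total} / {max_total}")  # prints "26 / 100"
```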


Stars: 9
Forks:
Language: Python
License: MIT
Last pushed: Feb 04, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OSU-NLP-Group/cobalt"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
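
The same request can be made from Python with only the standard library. This is a minimal sketch, not a documented client: the helper names `build_quality_url` and `fetch_quality` are hypothetical, the category/owner/repo path layout is inferred from the single curl URL above, and the response is assumed to be JSON, since this page does not describe its format.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL.

    Assumption: the path is <category>/<owner>/<repo>, as inferred
    from the one example URL shown on this page.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it.

    Assumption: the endpoint returns a JSON body. No API key is sent,
    so the keyless daily limit applies.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("llm-tools", "OSU-NLP-Group", "cobalt")` issues the same GET request as the curl command. Each call counts against the daily request limit, so caching the result locally is worth considering when polling many repositories.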