OussamaSghaier/CuREV

Harnessing Large Language Models for Curated Code Reviews

34 / 100 (Emerging)

When training AI models to assist with software development tasks like suggesting code improvements or generating feedback, the quality of the training data is crucial. This project provides a curated dataset of code review comments, designed to improve how well Large Language Models understand and generate helpful, relevant code review feedback. It takes raw code review data and processes it into a higher-quality format, primarily benefiting AI researchers and machine learning engineers working on code-related LLMs.

No commits in the last 6 months.

Use this if you are an AI researcher or machine learning engineer looking for a higher-quality dataset to train or fine-tune large language models for tasks involving code review comment generation or code refinement.

Not ideal if you are a software developer looking for a tool to automate your code reviews directly, as this is a dataset and framework for training models, not an end-user application.

AI-training-data natural-language-processing software-engineering-AI machine-learning-research code-review-automation
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 16 / 25
Community 12 / 25
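
The four category scores above add up to the overall score shown at the top of the page (0 + 6 + 16 + 12 = 34). A minimal sketch of that arithmetic follows, assuming the overall score is a plain unweighted sum of the categories; the page does not state the formula, so treat this as a consistency check rather than the official calculation.

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 6,
    "Maturity": 16,
    "Community": 12,
}

# Assumption: the overall score is the unweighted sum of the four categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")  # prints "34 / 100"
```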


Stars: 18
Forks: 3
Language: Python
License: MIT
Last pushed: Mar 19, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/OussamaSghaier/CuREV"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
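
The same request can be made from Python. The sketch below uses only the standard library and copies the URL layout from the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, reading the `llm-tools` path segment as a category is an assumption, and so is the expectation that the endpoint returns JSON, since the page does not document the response format.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "llm-tools") -> str:
    """Build the endpoint URL for one repository (hypothetical helper).

    Assumption: the path is <category>/<owner>/<repo>, as in the curl example.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return f"{API_BASE}/{'/'.join(parts)}"


def fetch_quality(owner: str, repo: str, category: str = "llm-tools",
                  timeout: float = 10.0):
    """Fetch the quality record for a repository (hypothetical helper).

    Assumption: the response body is JSON. The schema is not documented on
    this page, so the parsed value is returned as-is.
    """
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the URL needs no network access.
print(build_quality_url("OussamaSghaier", "CuREV"))
```

Calling `fetch_quality("OussamaSghaier", "CuREV")` would issue the same GET request as the curl command. Without a key, each call counts against the 100 requests/day limit, so caching the result locally is worth considering for repeated lookups.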