jszheng21/RACE

RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.

Score: 27 / 100 (Experimental)

RACE helps evaluate how well large language models (LLMs) generate code. It takes code produced by an LLM and assesses it across four dimensions: readability, maintainability, correctness, and efficiency. The output is a detailed report on the code's quality, which AI researchers and developers can use to compare and improve code generation models.

No commits in the last 6 months.

Use this if you are developing or comparing large language models and need a comprehensive way to benchmark their ability to produce high-quality, practical code.

Not ideal if you are an end-user simply looking to use an LLM for code generation without needing to benchmark its underlying performance characteristics.

Tags: AI model evaluation, code generation, LLM benchmarking, software engineering, AI research
Stale (6 months) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 6 / 25

How are scores calculated?
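
The methodology is not documented on this page, but the overall score matches the sum of the four category scores, each capped at 25: 0 + 5 + 16 + 6 = 27 out of 100.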

Stars: 12
Forks: 1
Language: Python
License: Apache-2.0
Last pushed: Oct 12, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ai-coding/jszheng21/RACE"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
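
For programmatic use, here is a minimal Python sketch. It assumes only what the curl command above shows: an HTTPS GET endpoint that returns JSON. The response fields are not documented on this page, so the sketch prints the raw payload rather than assuming a schema.

import json
import urllib.request

# Endpoint taken from the curl example above; the path encodes the repo slug.
URL = "https://pt-edge.onrender.com/api/v1/quality/ai-coding/jszheng21/RACE"

def fetch_quality_report(url: str = URL) -> dict:
    """Fetch the quality report as parsed JSON.

    Assumes the endpoint returns a JSON body, as the curl example
    suggests; field names are not documented here.
    """
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    report = fetch_quality_report()
    # Pretty-print the raw payload instead of assuming specific keys.
    print(json.dumps(report, indent=2))

The standard-library urllib client keeps the sketch dependency-free; swapping in requests or httpx would work the same way if those are already in your environment.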