yueyueL/ReliableLM4Code
A curated collection of research, benchmarks, and tools toward more robust and reliable language models for code. Topics: LM4Code, LM4SE, reliable LLM, LLM4Code
This resource helps software engineering researchers and practitioners understand and address the common pitfalls that undermine the reliability of large language models used for code intelligence tasks. It provides a curated collection of research papers, benchmarks, and tools: starting from existing research and models for code-related tasks, it builds a clearer picture of potential issues and of solutions for more robust systems. It is aimed at anyone researching, developing, or deploying AI-powered tools for code, such as automated bug repair or test case generation.
No commits in the last 6 months.
Use this if you are working with large language models for software engineering tasks and need to identify, understand, and mitigate potential reliability issues and pitfalls in their design or application.
Not ideal if you are looking for an off-the-shelf development library or a tutorial on basic LLM implementation for code, rather than research insights into reliability challenges.
Stars: 30
Forks: 2
Language: —
License: —
Category:
Last pushed: Dec 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ai-coding/yueyueL/ReliableLM4Code"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
k4black/codebleu
Pip compatible CodeBLEU metric implementation available for linux/macos/win
LiveCodeBench/LiveCodeBench
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of...
EdinburghNLP/code-docstring-corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and...
hendrycks/apps
APPS: Automated Programming Progress Standard (NeurIPS 2021)
solis-team/Hydra
[FSE 2026] Do Not Treat Code as Natural Language: Implications for Repository-Level Code...