yueyueL/ReliableLM4Code
A curated collection of research, benchmarks, and tools toward more robust and reliable language models for code. Topics: LM4Code, LM4SE, reliable LLM, LLM4Code
This resource helps software engineering researchers and practitioners understand and address the common pitfalls that undermine the reliability of large language models used for code intelligence tasks. It provides a curated collection of research papers, benchmarks, and tools: starting from existing research and models for code-related tasks, it builds a clearer picture of potential issues and of solutions for more robust systems. It is aimed at anyone researching, developing, or deploying AI-powered tools for code, such as automated bug repair or test case generation.
No commits in the last 6 months.
Use this if you are working with large language models for software engineering tasks and need to identify, understand, and mitigate potential reliability issues and pitfalls in their design or application.
Not ideal if you are looking for an off-the-shelf development library or a tutorial on basic LLM implementation for code, rather than research insights into reliability challenges.
Stars: 30
Forks: 2
Language: —
License: —
Category:
Last pushed: Dec 14, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ai-coding/yueyueL/ReliableLM4Code"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
k4black/codebleu
Pip compatible CodeBLEU metric implementation available for linux/macos/win
LiveCodeBench/LiveCodeBench
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of...
EdinburghNLP/code-docstring-corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and...
hendrycks/apps
APPS: Automated Programming Progress Standard (NeurIPS 2021)
solis-team/Hydra
[FSE 2026] Do Not Treat Code as Natural Language: Implications for Repository-Level Code...