Code Model Training AI Coding Tools
Tools and frameworks for pre-training, fine-tuning, and optimizing language models specifically for code generation and programming tasks. Does NOT include inference-only tools, deployment platforms, or general LLM training frameworks.
68 code model training tools are tracked. Two score above 50 (Established tier). The highest-rated is k4black/codebleu at 56/100 with 130 stars.
Get all 68 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ai-coding&subcategory=code-model-training&limit=20"
Open to everyone: 100 requests/day, no key needed. A free key raises the limit to 1,000/day.
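The same request can be sketched in Python using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the returned JSON is not documented on this page, so the sketch returns the decoded payload without interpreting it. Note that the curl example uses `limit=20`; whether a larger `limit` returns all 68 projects in one call is an assumption to check against the API.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ai-coding", subcategory="code-model-training", limit=20):
    """Build the query URL; defaults mirror the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper).

    The response schema is not described on this page, so the decoded
    payload is returned as-is for the caller to inspect.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_projects()` issues one unauthenticated request, which counts against the 100 requests/day allowance mentioned above.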
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | k4black/codebleu<br>Pip compatible CodeBLEU metric implementation available for linux/macos/win | | Established |
| 2 | LiveCodeBench/LiveCodeBench<br>Official repository for the paper "LiveCodeBench: Holistic and Contamination... | | Established |
| 3 | EdinburghNLP/code-docstring-corpus<br>Preprocessed Python functions and docstrings for automated code... | | Emerging |
| 4 | hendrycks/apps<br>APPS: Automated Programming Progress Standard (NeurIPS 2021) | | Emerging |
| 5 | solis-team/Hydra<br>[FSE 2026] Do Not Treat Code as Natural Language: Implications for... | | Emerging |
| 6 | alxschwrz/codex_py2cpp<br>Converts python code into c++ by using OpenAI CODEX. | | Emerging |
| 7 | AS-SiliconMind/SiliconMind-V1<br>Inference Engine for SiliconMind-V1 Verilog Coding Models | | Emerging |
| 8 | tongye98/Awesome-Code-Benchmark<br>A comprehensive code domain benchmark review of LLM researches. | | Emerging |
| 9 | reddy-lab-code-research/PPOCoder<br>Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation... | | Emerging |
| 10 | bharathsudharsan/OTA-TinyML<br>Code for IEEE Internet Computing Journal paper 'OTA-TinyML: Over the Air... | | Emerging |
| 11 | logpai/LogBench<br>A benchmark for logging statement generation. | | Emerging |
| 12 | s2e-lab/Code-Smell-Code-Generation<br>Source code for "An Empirical Study of Code Smells in Transformer-based Code... | | Emerging |
| 13 | zorazrw/odex<br>[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation | | Emerging |
| 14 | vl2g/floco<br>Flow Chart Image-to-Code Generation | | Emerging |
| 15 | code-gen/cscg<br>Code Generation as a Dual Task of Code Summarization. | | Emerging |
| 16 | CloudIDEaaS-zz/hydra<br>Hydra is a app generation product. Hydra aims to reduce the "concept to... | | Emerging |
| 17 | 99EnriqueD/verilog_autocompletion<br>Code implementation for "A Deep Learning Framework for Verilog... | | Emerging |
| 18 | s2e-lab/SecurityEval<br>Repository for "SecurityEval Dataset: Mining Vulnerability Examples to... | | Emerging |
| 19 | devashish-gupta/Geode<br>A zero-shot geospatial question answering agent with precise spatiotemporal... | | Emerging |
| 20 | matlab-deep-learning/Deep_Learning_Poker_Player_using_MATLAB_and_Raspberry_Pi<br>This example shows how to use automatic code generation to deploy a deep... | | Emerging |
| 21 | Gen-Verse/CURE<br>[NeurIPS 2025 Spotlight] Co-Evolving LLM Coder and Unit Tester via... | | Emerging |
| 22 | madaan/pie-perf<br>Training language models to make programs faster | | Emerging |
| 23 | formula-code/fc-eval<br>Evaluation harness for FormulaCode | | Emerging |
| 24 | WebPAI/Interaction2Code<br>[ASE 2025] Benchmarking MLLM-based Interactive Webpage Code Generation from... | | Emerging |
| 25 | Pavansomisetty21/Automated-Code-Generation-and-Execution-Agent-using-LangChain-and-Cohere-LLM<br>In this we implement an agent which generates and executes code using cohere... | | Emerging |
| 26 | Rudra5417/Code-Generator-using-GPT-3<br>Natural Language to Code | | Experimental |
| 27 | HIT-SCIR/Abacus<br>珠算代码大模型 (Abacus Code LLM) | | Experimental |
| 28 | HySonLab/Design2Code<br>Large Language Model in combination with Large Vision Model for the task of... | | Experimental |
| 29 | matthewdeanmartin/paipi<br>Pypi search, except the backend is an LLM's pixelated memory of Pypi. | | Experimental |
| 30 | aswathselvam/Potholes<br>Realtime pothole detection on Android phone's IMU data. SVM model in C++, ... | | Experimental |
| 31 | aixcoder-plugin/nl2code-dataset<br>Aix-bench, the Java benchmark for code synthesis problem. | | Experimental |
| 32 | jszheng21/RACE<br>RACE is a multi-dimensional benchmark for code generation that focuses on... | | Experimental |
| 33 | domaineval/DomainEval<br>DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation... | | Experimental |
| 34 | KohlerHECTOR/interpreter-py<br>Implementation of Interpretable and Editable Programmatic Tree Policies for... | | Experimental |
| 35 | albertusk95/intention-to-code-lstm<br>Source Code Generation Based On User Intention Using LSTM Networks | | Experimental |
| 36 | seal-research/OmniCode<br>OmniCode: A Diverse Software Engineering Benchmark for Evaluating Large... | | Experimental |
| 37 | CodeEff/ECCO<br>[EMNLP 2024] Code for the paper "ECCO: Can We Improve Model-Generated Code... | | Experimental |
| 38 | medxiaorudan/CodeGeneration<br>Prompt engineering with Langchain and fine-tuning the CodeLlama model. The... | | Experimental |
| 39 | formula-code/terminal-bench<br>Evaluation harness for FormulaCode | | Experimental |
| 40 | LiuZeJie97/Code-Generation-From-Flowcharts-with-Texts-A-Benchmark-Dataset-and-An-Approach<br>Code for the paper "Code Generation From Flowcharts with Texts: A Benchmark... | | Experimental |
| 41 | yunbow/ai-dev-os-benchmark<br>Benchmark: how AI coding guidelines affect code quality - 3 conditions × 9... | | Experimental |
| 42 | adpena/vertigo-lora<br>Domain-specialized LoRA fine-tuning pipeline for Roblox/Luau code generation... | | Experimental |
| 43 | kroq86/honeybadger<br>formal VM benchmark and inspectable reasoning runtime for testing whether... | | Experimental |
| 44 | sephirxth/LLM_code_test<br>LLM code generation benchmark - Claude vs Gemini vs DeepSeek vs Grok on a... | | Experimental |
| 45 | LIANGQINGYUAN/Lyra<br>Lyra: A Benchmark for Turducken-Style Code Generation | | Experimental |
| 46 | Meisdy/Speech-to-Code-Generation-for-Collaborative-Robots<br>A modular pipeline that lets users program collaborative robots through... | | Experimental |
| 47 | yueyueL/ReliableLM4Code<br>Collections of research, benchmarks and tools towards more robust and... | | Experimental |
| 48 | ftrou/Decodifier<br>The Compiler for AI-Generated Software. LLMs don’t write code. ... | | Experimental |
| 49 | kabirjaipal/Evil-Codes<br>Evil Codes is a repository where you will find many useful code snippets and... | | Experimental |
| 50 | jacopotagliabue/LLMs-to-Alloy<br>Example of LLM generated Alloy code for deductive reasoning from English... | | Experimental |
| 51 | sssszh/CodePLAN<br>The code repository for the paper “Enhancing Code Generation Performance of... | | Experimental |
| 52 | falconvn2006/GPasT<br>GPT for Pascal code generation :) | | Experimental |
| 53 | AngelicaArabe/OTA-IOT<br>🔧 Develop IoT applications with ESP32-S3 using OTA updates, SPIFFS web... | | Experimental |
| 54 | ada994/prism-bench<br>🌐 Benchmark models using the PRISM framework and access the FLUX-Reason-6M... | | Experimental |
| 55 | ALM3ARQ/character-prefix-conditioning<br>🔍 Streamline token sampling with character prefix conditioning using a... | | Experimental |
| 56 | cloudrishi/springboot-ai-generator<br>AI-powered Spring Boot code generator using CodeLlama LLM running locally via Ollama | | Experimental |
| 57 | gokhanercan/gen-atomic<br>An LLM-based code generation framework aims to support a wide range of... | | Experimental |
| 58 | HWH-2000/DynaCode<br>[ACL'2025 Findings] DynaCode: A Dynamic Complexity-Aware Code Benchmark for... | | Experimental |
| 59 | AshrafMorningstar/omni-code-polyglot<br>A massive, SEO-optimized collection of 300+ ready-to-run code snippets in... | | Experimental |
| 60 | Bifrost-Technologies/Prometheus<br>A developer platform for generating complete Solana programs in one-shot... | | Experimental |
| 61 | przeprogramowani/10x-bench-eval<br>Scoring criteria for 10x-bench (10xbench.ai) | | Experimental |
| 62 | evalops/llmcc<br>LLM-native compiler toolchain - implementing 'LLM ≈ probabilistic compiler'... | | Experimental |
| 63 | Jayveersinh-Raj/code_generation_gpt2<br>Fine tuning a gpt2 model for code generation/completion. This is the work... | | Experimental |
| 64 | navneetprabhakar/telegram-bot-llm<br>Telegram bot with LLM code gen capabilities | | Experimental |
| 65 | runaicode/ai-coding-benchmarks<br>Standardized test prompts and benchmarks for evaluating AI coding... | | Experimental |
| 66 | moritzWa/BugDetectionBench<br>A benchmark dataset of real-world code review comments, designed to evaluate... | | Experimental |
| 67 | rajat-kumar-thakur/LLMs-for-Resource-Constrained-Devices<br>This work was done as part of SRIP 2025 Internship, IIT Gandhinagar | | Experimental |
| 68 | rudijetson/grammar-ops<br>LLM-native codebase grammar system - Transform natural language patterns... | | Experimental |