madaan/pie-perf
Training language models to make programs faster
This project provides a dataset for machine learning engineers and researchers working on program optimization. It contains pairs of code snippets (in Python and C++) for competitive programming problems, where one version is measurably faster than the other. Metadata for each pair includes CPU time, memory usage, and the percentage of performance improvement, so researchers can train and evaluate models that generate performance-improving code edits.
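As a rough illustration of how such slow/fast pairs and their timing metadata might be used, the sketch below computes a percentage improvement from CPU times and keeps only pairs above a threshold. The `ProgramPair` class, its field names, the improvement formula, and the sample values are all hypothetical and chosen for illustration; they are not the dataset's actual schema or data.

```python
from dataclasses import dataclass


@dataclass
class ProgramPair:
    """One slow/fast pair. Field names are hypothetical, not the dataset's schema."""
    problem_id: str
    language: str            # "Python" or "C++"
    slow_code: str
    fast_code: str
    slow_cpu_time_ms: float
    fast_cpu_time_ms: float


def improvement_pct(pair: ProgramPair) -> float:
    """CPU-time reduction as a percentage of the slow version's time (assumed definition)."""
    return 100.0 * (pair.slow_cpu_time_ms - pair.fast_cpu_time_ms) / pair.slow_cpu_time_ms


def select_pairs(pairs, min_improvement_pct=10.0):
    """Keep pairs whose improvement clears a threshold (the 10% default is arbitrary)."""
    return [p for p in pairs if improvement_pct(p) >= min_improvement_pct]


# Toy records invented for this example, not taken from the dataset.
pairs = [
    ProgramPair("p001", "Python", "sum(list(range(n)))", "n * (n - 1) // 2", 200.0, 50.0),
    ProgramPair("p002", "C++", "/* slow version */", "/* fast version */", 100.0, 95.0),
]

selected = select_pairs(pairs)
print([(p.problem_id, improvement_pct(p)) for p in selected])
```

With these toy values, only the first pair (a 75% reduction) passes the 10% threshold. A filter of this kind is one plausible way to pick training pairs with a meaningful speedup, but the threshold should be tuned to the measurement noise of the timing data.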
No commits in the last 6 months.
Use this if you are a machine learning researcher or engineer developing or evaluating models for code optimization, and you need a dataset with measured performance metrics such as CPU time, memory usage, and percentage improvement.
Not ideal if you are looking for a tool that directly optimizes your code or provides ready-to-use performance improvements for your production systems.
Stars: 98
Forks: 14
Language: Jupyter Notebook
License: none listed
Category: none listed
Last pushed: Apr 16, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ai-coding/madaan/pie-perf"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
k4black/codebleu
Pip compatible CodeBLEU metric implementation available for linux/macos/win
LiveCodeBench/LiveCodeBench
Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of...
EdinburghNLP/code-docstring-corpus
Preprocessed Python functions and docstrings for automated code documentation (code2doc) and...
hendrycks/apps
APPS: Automated Programming Progress Standard (NeurIPS 2021)
solis-team/Hydra
[FSE 2026] Do Not Treat Code as Natural Language: Implications for Repository-Level Code...