oripress/AlgoTune
AlgoTune is a NeurIPS 2025 benchmark made up of 154 math, physics, and computer science problems. The goal is to write code that solves each problem and runs faster than existing implementations.
AlgoTune helps you benchmark how well large language models can optimize code for common math, physics, and computer science functions. You provide existing code and select an AI model; AlgoTune then outputs new code versions and detailed speed-up reports. This is for researchers and engineers who want to assess or improve the performance optimization capabilities of AI models.
Use this if you need to systematically evaluate how effectively large language models can generate faster, equivalent code for numerical problems.
Not ideal if you are looking for a tool to automatically fix bugs in your existing code or to generate code from natural language prompts without a focus on performance optimization.
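The core loop the benchmark measures can be illustrated with a minimal timing harness. This is a sketch of the idea, not AlgoTune's actual harness: a candidate implementation must first match the reference answer, and only then is its speed-up over the baseline reported.

```python
import timeit

def baseline_solve(n):
    # Naive O(n) loop, standing in for a reference implementation.
    total = 0
    for i in range(n):
        total += i
    return total

def optimized_solve(n):
    # Closed-form replacement; it must return the same answer.
    return n * (n - 1) // 2

n = 100_000
# Correctness gate first: a faster-but-wrong solution scores nothing.
assert baseline_solve(n) == optimized_solve(n)

t_base = timeit.timeit(lambda: baseline_solve(n), number=20)
t_fast = timeit.timeit(lambda: optimized_solve(n), number=20)
print(f"speedup: {t_base / t_fast:.1f}x")
```

The correctness check before timing mirrors the benchmark's framing of "faster, equivalent code": equivalence is a precondition, speed-up is the score.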
Stars
95
Forks
13
Language
Python
License
MIT
Category
Last pushed
Mar 12, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/oripress/AlgoTune"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
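The same endpoint can be queried from Python with the standard library. Note that the field names in `summarize` below are guesses based on the metadata shown on this page (stars, forks, license); the API's real response schema may differ.

```python
import json
from urllib.request import urlopen

API = "https://pt-edge.onrender.com/api/v1/quality/transformers/oripress/AlgoTune"

def fetch_quality(url=API, timeout=10):
    # Live call; rate-limited to 100 requests/day without a key.
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def summarize(payload):
    # Hypothetical field names -- adjust to the actual schema.
    return {k: payload.get(k) for k in ("stars", "forks", "license")}

# Offline example with a stand-in payload:
sample = {"stars": 95, "forks": 13, "license": "MIT"}
print(summarize(sample))  # {'stars': 95, 'forks': 13, 'license': 'MIT'}
```

Keeping the parsing step separate from the network call makes it easy to test against a recorded payload before hitting the rate-limited endpoint.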
Related projects
xjywhu/Awesome-Multimodal-LLM-for-Code
Multimodal Large Language Models for Code Generation under Multimodal Scenarios
jie-jw-wu/human-eval-comm
HumanEvalComm: Evaluating Communication Skill of Code LLM and LLM Agent
juyongjiang/CodeUp
CodeUp: A Multilingual Code Generation Llama-X Model with Parameter-Efficient Instruction-Tuning
JHansiduYapa/Fine-Tuning-a-Small-Language-Model-for-Cypher-Query-Generation
This project fine-tunes Unsloth's Gemma-3 4B IT (4-bit) model to translate natural language into...
Gen-Verse/ReasonFlux
[NeurIPS 2025 Spotlight] LLM post-training suite — featuring ReasonFlux, ReasonFlux-PRM, and...