svilupp/Julia-LLM-Leaderboard

Provides a platform for the Julia community to compare AI models' abilities in generating syntactically correct Julia code, featuring structured tests and automated evaluations for easy and collaborative benchmarking.

Score: 38 / 100 (Emerging)

This project helps Julia developers and data scientists evaluate how well different AI models generate correct Julia code. Given a set of AI models and prompting strategies, it produces a leaderboard ranking how often the generated code parses, executes, and passes its test cases. It's designed for anyone in the Julia community who needs to choose the best AI model for their code generation tasks.

No commits in the last 6 months.

Use this if you are a Julia developer or data scientist looking to compare the code generation capabilities of various AI models to select the most suitable one for your projects.

Not ideal if you are looking for a benchmark of AI models in languages other than Julia, or if your primary need is for academic, theoretical evaluations rather than practical, 'does it work' testing.

Tags: Julia-programming, AI-code-generation, developer-tools, large-language-models, performance-benchmarking
Flags: Stale (6 months) · No Package · No Dependents
Maintenance: 0 / 25
Adoption: 9 / 25
Maturity: 16 / 25
Community: 13 / 25


Stars: 86
Forks: 10
Language: HTML
License: MIT
Last pushed: Aug 14, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/svilupp/Julia-LLM-Leaderboard"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
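To call the same endpoint from Julia itself, here is a minimal sketch. It assumes the HTTP.jl and JSON3.jl packages are installed; fetch_quality_report is a hypothetical helper name, and because the response schema is not documented here, the result is simply printed back as pretty JSON rather than picking out specific fields.

using HTTP, JSON3

# Hypothetical helper: fetch the quality report for owner/repo from the
# pt-edge API. The endpoint path is taken from the curl example above.
function fetch_quality_report(owner::AbstractString, repo::AbstractString)
    url = "https://pt-edge.onrender.com/api/v1/quality/generative-ai/$owner/$repo"
    resp = HTTP.get(url)                  # GET request; no API key needed for 100 requests/day
    return JSON3.read(String(resp.body))  # parse the JSON body into a JSON3 object
end

report = fetch_quality_report("svilupp", "Julia-LLM-Leaderboard")
JSON3.pretty(JSON3.write(report))         # re-serialize and pretty-print; schema not assumed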