svilupp/Julia-LLM-Leaderboard
Provides a platform for the Julia community to compare AI models' ability to generate syntactically correct Julia code, with structured test cases and automated evaluations for easy, collaborative benchmarking.
This project helps Julia developers and data scientists evaluate how well different AI models generate correct Julia code. You supply the models and prompting strategies to test, and it produces a leaderboard scoring each combination on whether the generated code parses, executes, and passes unit tests. It's designed for anyone in the Julia community who needs to pick the best model for code generation tasks.
No commits in the last 6 months.
Use this if you are a Julia developer or data scientist looking to compare the code generation capabilities of various AI models to select the most suitable one for your projects.
Not ideal if you are looking for a benchmark of AI models in languages other than Julia, or if your primary need is for academic, theoretical evaluations rather than practical, 'does it work' testing.
Stars: 86
Forks: 10
Language: HTML
License: MIT
Last pushed: Aug 14, 2024
Commits (last 30 days): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/svilupp/Julia-LLM-Leaderboard"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
Higher-rated alternatives
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...