LeonYang95/LLM4UT
Evaluation code of ASE24 accepted paper "On the Evaluation of LLM in Unit Test Generation"
This project helps software quality engineers and researchers evaluate how well large language models (LLMs) generate unit tests for Java projects. You supply Java codebases and the LLM responses to test-generation prompts, and it outputs a detailed analysis of the LLM's test-generation quality, including compilation success and test effectiveness. It's designed for those assessing the practical use of LLMs in software testing workflows.
No commits in the last 6 months.
Use this if you are a software quality engineer or researcher needing to systematically evaluate the performance of different LLMs in generating unit tests for Java applications.
Not ideal if you are looking for a tool to generate unit tests directly or to improve your existing testing process without focusing on LLM evaluation.
Stars: 13
Forks: 1
Language: HTML
License: MulanPSL-2.0
Category:
Last pushed: Dec 09, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/LeonYang95/LLM4UT"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
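For programmatic use, the curl call above can be wrapped in a small Python sketch. This assumes the endpoint returns a JSON body; the response schema is not documented here, so the fields are printed as-is rather than parsed.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def tool_url(owner: str, repo: str) -> str:
    # Build the per-repository endpoint, e.g. .../LeonYang95/LLM4UT
    return f"{BASE}/{owner}/{repo}"


def fetch_stats(owner: str, repo: str) -> dict:
    # Assumption: the endpoint returns JSON (schema undocumented here).
    with urllib.request.urlopen(tool_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_stats("LeonYang95", "LLM4UT"), indent=2))
```

No API key is attached, matching the no-key tier above; a key, if obtained, would presumably be passed as a header or query parameter per the service's docs.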
Higher-rated alternatives
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral,...
IBM/unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the...
lean-dojo/LeanDojo
Tool for data extraction and interacting with Lean programmatically.
GoodStartLabs/AI_Diplomacy
Frontier Models playing the board game Diplomacy.
google/litmus
Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application...