LeonYang95/LLM4UT
Evaluation code of ASE24 accepted paper "On the Evaluation of LLM in Unit Test Generation"
This project helps software quality engineers and researchers evaluate how well large language models (LLMs) generate unit tests for Java projects. You supply Java codebases and the LLM responses to test-generation prompts, and it outputs a detailed analysis of the LLM's test-generation quality, including compilation success and test effectiveness. It's designed for those assessing the practical use of LLMs in software testing workflows.
No commits in the last 6 months.
Use this if you are a software quality engineer or researcher needing to systematically evaluate the performance of different LLMs in generating unit tests for Java applications.
Not ideal if you are looking for a tool to generate unit tests directly or to improve your existing testing process without focusing on LLM evaluation.
Stars: 13
Forks: 1
Language: HTML
License: MulanPSL-2.0
Category:
Last pushed: Dec 09, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/LeonYang95/LLM4UT"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
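For programmatic use, the curl call above can be wrapped in a small Python sketch. This assumes the endpoint returns a JSON body; the response schema is not documented here, so the fields are printed as-is rather than parsed.

```python
import json
import urllib.request

# Base endpoint taken from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def tool_url(owner: str, repo: str) -> str:
    # Build the per-repository endpoint, e.g. .../LeonYang95/LLM4UT
    return f"{BASE}/{owner}/{repo}"


def fetch_stats(owner: str, repo: str) -> dict:
    # Assumption: the endpoint returns JSON (schema undocumented here).
    with urllib.request.urlopen(tool_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_stats("LeonYang95", "LLM4UT"), indent=2))
```

No API key is attached, matching the no-key tier above; a key, if obtained, would presumably be passed as a header or query parameter per the service's docs.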
Higher-rated alternatives
open-compass/opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral,...
IBM/unitxt
🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the...
lean-dojo/LeanDojo
Tool for data extraction and interacting with Lean programmatically.
GoodStartLabs/AI_Diplomacy
Frontier Models playing the board game Diplomacy.
google/litmus
Litmus is a comprehensive LLM testing and evaluation tool designed for GenAI Application...