LLM Comparison & Evaluation Prompt Engineering Tools
There are 5 LLM comparison and evaluation tools tracked. The highest-rated is ExpertiseModel/MuTAP at 33/100 with 54 stars.
Get all 5 projects as JSON:

```bash
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=prompt-engineering&subcategory=llm-comparison-evaluation&limit=20"
```
The API is open to everyone: 100 requests/day with no key needed, or 1,000 requests/day with a free key.
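For programmatic use, here is a minimal Python sketch that calls the same endpoint and prints each tool. The response field names (`data`, `name`, `score`, `tier`) are assumptions inferred from the table below, not documented API behavior, so adjust them to match the actual payload. The sketch also omits any API key, since the authentication mechanism for keyed requests is not shown here.

```python
import requests

# Same endpoint and query parameters as the curl example above.
API_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"
PARAMS = {
    "domain": "prompt-engineering",
    "subcategory": "llm-comparison-evaluation",
    "limit": 20,
}

resp = requests.get(API_URL, params=PARAMS, timeout=10)
resp.raise_for_status()
payload = resp.json()

# Assumed response shape: either a bare list of project records,
# or an object with the records under a "data" key.
projects = payload if isinstance(payload, list) else payload.get("data", [])

for project in projects:
    # "name", "score", and "tier" are assumed field names.
    print(project.get("name"), project.get("score"), project.get("tier"))
```

If your payload nests the records differently, inspect `resp.json()` once in an interactive session and adapt the field lookups accordingly.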
| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | ExpertiseModel/MuTAP | MuTAP: a prompt-based learning technique to automatically generate test... | 33 | Emerging |
| 2 | INPVLSA/probefish | A web-based LLM prompt and endpoint testing platform. Organize, version,... | | Experimental |
| 3 | thabit-ai/thabit | Thabit is a platform to evaluate prompts on multiple LLMs to determine the... | | Experimental |
| 4 | nicolay-r/llm-prompt-checking | Toolset for checking differences in recognising semantic relation presence... | | Experimental |
| 5 | alexandrughinea/lm-tiny-prompt-evaluation-framework | This project provides a tiny framework for testing different prompt versions... | | Experimental |