LLM Comparison and Evaluation Prompt Engineering Tools

There are 5 LLM comparison and evaluation tools tracked. The highest-rated is ExpertiseModel/MuTAP, scoring 33/100 with 54 stars.

Get all 5 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=prompt-engineering&subcategory=llm-comparison-evaluation&limit=20"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
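For programmatic access, the curl request above can be reproduced in Python with the standard library alone. This is a minimal sketch: the endpoint and query parameters come from the example above, but the response shape and the `X-API-Key` header name are assumptions, so check the API documentation before relying on them.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain, subcategory, limit=20):
    # Assemble the same query string shown in the curl example.
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE}?{query}"

def fetch_projects(domain, subcategory, limit=20, api_key=None):
    # The X-API-Key header name is an assumption; the keyless tier
    # (100 requests/day) needs no header at all.
    req = urllib.request.Request(build_url(domain, subcategory, limit))
    if api_key:
        req.add_header("X-API-Key", api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `build_url("prompt-engineering", "llm-comparison-evaluation")` yields exactly the URL from the curl example, so the two access paths stay in sync.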

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | ExpertiseModel/MuTAP | MuTAP: a prompt-based learning technique to automatically generate test... | 33 | Emerging |
| 2 | INPVLSA/probefish | A web-based LLM prompt and endpoint testing platform. Organize, version,... | 27 | Experimental |
| 3 | thabit-ai/thabit | Thabit is a platform to evaluate prompts on multiple LLMs to determine the... | 17 | Experimental |
| 4 | nicolay-r/llm-prompt-checking | Toolset for checking differences in recognising semantic relation presence... | 11 | Experimental |
| 5 | alexandrughinea/lm-tiny-prompt-evaluation-framework | This project provides a tiny framework for testing different prompt versions... | 10 | Experimental |