LLM Comparison and Evaluation Prompt Engineering Tools

There are 5 LLM comparison and evaluation tools tracked. The highest-rated is ExpertiseModel/MuTAP, scoring 33/100 with 54 stars.

Get all 5 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=prompt-engineering&subcategory=llm-comparison-evaluation&limit=20"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
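For programmatic access, the curl request above can be reproduced in Python with the standard library alone. This is a minimal sketch: the endpoint and query parameters come from the example above, but the response shape and the `X-API-Key` header name are assumptions, so check the API documentation before relying on them.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE = "https://pt-edge.onrender.com/api/v1/datasets/quality"

def build_url(domain, subcategory, limit=20):
    # Assemble the same query string shown in the curl example.
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE}?{query}"

def fetch_projects(domain, subcategory, limit=20, api_key=None):
    # The X-API-Key header name is an assumption; the keyless tier
    # (100 requests/day) needs no header at all.
    req = urllib.request.Request(build_url(domain, subcategory, limit))
    if api_key:
        req.add_header("X-API-Key", api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling `build_url("prompt-engineering", "llm-comparison-evaluation")` yields exactly the URL from the curl example, so the two access paths stay in sync.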

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | ExpertiseModel/MuTAP | MuTAP: a prompt-based learning technique to automatically generate test... | 33 | Emerging |
| 2 | INPVLSA/probefish | A web-based LLM prompt and endpoint testing platform. Organize, version,... | 27 | Experimental |
| 3 | thabit-ai/thabit | Thabit is a platform to evaluate prompts on multiple LLMs to determine the... | 17 | Experimental |
| 4 | nicolay-r/llm-prompt-checking | Toolset for checking differences in recognising semantic relation presence... | 11 | Experimental |
| 5 | alexandrughinea/lm-tiny-prompt-evaluation-framework | This project provides a tiny framework for testing different prompt versions... | 10 | Experimental |