LLM Testing Frameworks: Prompt Engineering Tools

Tools for systematically testing, evaluating, and validating LLM-powered applications through unit tests, integration tests, regression detection, and failure analysis. Does NOT include prompt optimization, monitoring/observability, or general testing frameworks without LLM-specific features.

There are 37 LLM testing framework tools tracked. The highest-rated is genieincodebottle/schemalock at 43/100 with 1 star.

Get all 37 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=prompt-engineering&subcategory=llm-testing-frameworks&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
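The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it reuses the URL and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the payload is returned as-is. Whether the API accepts a `limit` above 20 (for example 37, to cover every tracked project) is an assumption to verify.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the dataset URL with the query parameters from the curl command."""
    params = {
        "domain": "prompt-engineering",
        "subcategory": "llm-testing-frameworks",
        "limit": limit,  # assumption: values other than 20 may or may not be accepted
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_projects(limit=20, timeout=30):
    """Fetch the dataset and return the decoded JSON payload unchanged.

    The response schema is not documented on this page, so no fields
    are assumed or extracted here.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily quota):
#   payload = fetch_projects()
#   print(json.dumps(payload, indent=2)[:500])
```

Each call counts against the 100 requests/day unauthenticated quota, so caching the payload locally is a reasonable choice when running it repeatedly.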

#	Tool	Score	Tier	Stars	Language
1	genieincodebottle/schemalock LLM output contract testing CLI, define what your pipeline must return, test...	43	Emerging	1	JavaScript
2	antsanchez/prompto Interact with various LLMs in your browser (LangChain.js, Angular)	39	Emerging	17	HTML
3	Coolhand-Labs/coolhand-ruby Zero-config LLM cost & quality monitoring for Ruby apps - automatically log...	38	Emerging	9	Ruby
4	joshualamerton/prompt-trace Prompt and response tracing for LLM workflows	35	Emerging	2	Python
5	suhjohn/llm-workbench UI for testing prompts across various datasets locally	35	Emerging	13	TypeScript
6	atjsh/llmlingua-2-js JavaScript/TypeScript implementation of LLMLingua-2 (Experimental)	34	Emerging	22	TypeScript
7	dzhng/llamaflow The Typescript-first prompt engineering toolkit for working with chat based LLMs.	34	Emerging	112	TypeScript
8	adarshM84/TextLLaMA Transform your writing with TextLLaMA! ✍️🚀 Simplify grammar, translate...	34	Emerging	15	HTML
9	Coolhand-Labs/coolhand-python Zero-config LLM cost & quality monitoring for Python apps - automatically...	33	Emerging	1	Python
10	parea-ai/parea-sdk-ts TypeScript SDK for experimenting, testing, evaluating & monitoring...	31	Emerging	4	TypeScript
11	drorIvry/consisTent A Comprehensive Testing Framework for Prompts	31	Emerging	3	Python
12	Cre4T3Tiv3/llm-prompt-debugger Clean UI for LLM development workflows with prompt versioning and model...	28	Experimental	48	TypeScript
13	sazed5055/llmtest pytest for LLM apps - Test for grounding failures, prompt injection,...	25	Experimental	3	Python
14	anurag-aryan-tech/Mafia-Mediator-Dashboard A Python + Tkinter desktop dashboard for mediating Mafia games with LLM...	24	Experimental	1	Python
15	elijahmuimi/llm-log Provide structured JSONL logging for large language models to simplify data...	22	Experimental	—	C++
16	yasemineren/Typesentry LLM evaluation harness for TypeScript: adversarial suites, static checks,...	21	Experimental	—	TypeScript
17	CodeForgeNet/tuneprompt Industrial-grade testing framework for LLM prompts	21	Experimental	—	TypeScript
18	VebjornNyvoll/promptcanary Lightweight prompt regression testing for your existing test suite. Test LLM...	21	Experimental	—	TypeScript
19	poyro/poyro Test your web app LLM integrations using existing testing frameworks....	21	Experimental	40	TypeScript
20	rawveg/intellillm-playground LLM Playground that works with Open Router	21	Experimental	—	TypeScript
21	RahulMK22/llmtest 🚀 Comprehensive testing framework for LLM applications with semantic...	20	Experimental	1	Python
22	calibrtr/llm-prompt-test LLM Prompt Test helps you test Large Language Models (LLMs) prompts to...	20	Experimental	5	TypeScript
23	pavankumarinfo/ai-testing-healthcare Public whitepaper on AI testing strategies in healthcare using prompt...	19	Experimental	2	—
24	Yuankai619/LLM-Generated-web-and-Playwright-E2E-Testing Experiment about using LLM to generate web pages that meet the requirements...	19	Experimental	13	TypeScript
25	WilliamK112/prompttrace Prompt engineering and LLM evaluation framework with trace visualization,...	17	Experimental	1	HTML
26	suzakuzhang/tarot-local-test An AI tarot reading web app with fixed card meanings and LLM-generated...	15	Experimental	1	Python
27	Mattbusel/prompt-observatory Unified LLM interpretability dashboard — real-time token streams,...	15	Experimental	2	Python
28	YagneshKhamar/phasio Jest-style testing for LLM prompts. Version prompts, run evals across OpenAI...	14	Experimental	—	TypeScript
29	chrifyessmine02/LLM4UnitTests-SC LLM4UnitTests-SC bridges AI and Blockchain Engineering by leveraging LLMs to...	14	Experimental	1	JavaScript
30	KristopherZlo/promptlab Evala is a team workspace for prompt engineering, AI experiments,...	14	Experimental	—	PHP
31	amitpuri/llm-playground LLM Playground - Demo Solution	13	Experimental	—	Python
32	quantiauy/llmunit LLMUnit is a developer-first platform designed to bring the rigors of unit...	13	Experimental	—	TypeScript
33	sphinx010/testAIgnite TestAIgnite: an enterprise Cypress framework using Llama-3, Mixtral, and...	13	Experimental	—	JavaScript
34	Omnia9789/ai-unit-test-generator-cli LLM-powered Python test generator CLI with single-function...	13	Experimental	—	Python
35	cktang88/system-prompt-tester Test system prompts	13	Experimental	—	TypeScript
36	RafalWilinski/prompt-testing-framework Test how good your prompts are against the expected results.	12	Experimental	7	TypeScript
37	moeki0/promptest The Prompt testing library for LLM that allows comparing patterns of prompts.	11	Experimental	—	TypeScript

Comparisons in this category

coolhand-ruby and coolhand-python (38 vs 33)