LLM Testing Frameworks Prompt Engineering Tools

Tools for systematically testing, evaluating, and validating LLM-powered applications through unit tests, integration tests, regression detection, and failure analysis. Does NOT include prompt optimization, monitoring/observability, or general testing frameworks without LLM-specific features.

There are 37 LLM testing framework tools tracked. The highest-rated is genieincodebottle/schemalock, at 43/100 with 1 star.

Get all 37 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=prompt-engineering&subcategory=llm-testing-frameworks&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
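The same request can be made from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the decoded payload is returned as-is. Note that the curl example passes `limit=20` while 37 projects are tracked; whether the API accepts a larger `limit` or paginates is an assumption left to the caller, which is why `limit` is a parameter.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="prompt-engineering",
              subcategory="llm-testing-frameworks",
              limit=20):
    """Build the dataset URL with the same query parameters as the curl call."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=10.0):
    """Fetch the dataset and return the decoded JSON payload unchanged.

    The response schema is not assumed; inspect the result before
    relying on any particular field.
    """
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the URL only; call fetch_projects() to perform the request.
    print(build_url())
```

Calling `fetch_projects()` counts against the keyless quota of 100 requests/day, so caching the result locally is worth considering for repeated runs.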

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | genieincodebottle/schemalock | LLM output contract testing CLI, define what your pipeline must return, test... | 43 | Emerging |
| 2 | antsanchez/prompto | Interact with various LLMs in your browser (LangChain.js, Angular) | 39 | Emerging |
| 3 | Coolhand-Labs/coolhand-ruby | Zero-config LLM cost & quality monitoring for Ruby apps - automatically log... | 38 | Emerging |
| 4 | joshualamerton/prompt-trace | Prompt and response tracing for LLM workflows | 35 | Emerging |
| 5 | suhjohn/llm-workbench | UI for testing prompts across various datasets locally | 35 | Emerging |
| 6 | atjsh/llmlingua-2-js | JavaScript/TypeScript implementation of LLMLingua-2 (Experimental) | 34 | Emerging |
| 7 | dzhng/llamaflow | The Typescript-first prompt engineering toolkit for working with chat based LLMs. | 34 | Emerging |
| 8 | adarshM84/TextLLaMA | Transform your writing with TextLLaMA! ✍️🚀 Simplify grammar, translate... | 34 | Emerging |
| 9 | Coolhand-Labs/coolhand-python | Zero-config LLM cost & quality monitoring for Python apps - automatically... | 33 | Emerging |
| 10 | parea-ai/parea-sdk-ts | TypeScript SDK for experimenting, testing, evaluating & monitoring... | 31 | Emerging |
| 11 | drorIvry/consisTent | A Comprehensive Testing Framework for Prompts | 31 | Emerging |
| 12 | Cre4T3Tiv3/llm-prompt-debugger | Clean UI for LLM development workflows with prompt versioning and model... | 28 | Experimental |
| 13 | sazed5055/llmtest | pytest for LLM apps - Test for grounding failures, prompt injection,... | 25 | Experimental |
| 14 | anurag-aryan-tech/Mafia-Mediator-Dashboard | A Python + Tkinter desktop dashboard for mediating Mafia games with LLM... | 24 | Experimental |
| 15 | elijahmuimi/llm-log | Provide structured JSONL logging for large language models to simplify data... | 22 | Experimental |
| 16 | yasemineren/Typesentry | LLM evaluation harness for TypeScript: adversarial suites, static checks,... | 21 | Experimental |
| 17 | CodeForgeNet/tuneprompt | Industrial-grade testing framework for LLM prompts | 21 | Experimental |
| 18 | VebjornNyvoll/promptcanary | Lightweight prompt regression testing for your existing test suite. Test LLM... | 21 | Experimental |
| 19 | poyro/poyro | Test your web app LLM integrations using existing testing frameworks.... | 21 | Experimental |
| 20 | rawveg/intellillm-playground | LLM Playground that works with Open Router | 21 | Experimental |
| 21 | RahulMK22/llmtest | 🚀 Comprehensive testing framework for LLM applications with semantic... | 20 | Experimental |
| 22 | calibrtr/llm-prompt-test | LLM Prompt Test helps you test Large Language Models (LLMs) prompts to... | 20 | Experimental |
| 23 | pavankumarinfo/ai-testing-healthcare | Public whitepaper on AI testing strategies in healthcare using prompt... | 19 | Experimental |
| 24 | Yuankai619/LLM-Generated-web-and-Playwright-E2E-Testing | Experiment about using LLM to generate web pages that meet the requirements... | 19 | Experimental |
| 25 | WilliamK112/prompttrace | Prompt engineering and LLM evaluation framework with trace visualization,... | 17 | Experimental |
| 26 | suzakuzhang/tarot-local-test | An AI tarot reading web app with fixed card meanings and LLM-generated... | 15 | Experimental |
| 27 | Mattbusel/prompt-observatory | Unified LLM interpretability dashboard — real-time token streams,... | 15 | Experimental |
| 28 | YagneshKhamar/phasio | Jest-style testing for LLM prompts. Version prompts, run evals across OpenAI... | 14 | Experimental |
| 29 | chrifyessmine02/LLM4UnitTests-SC | LLM4UnitTests-SC bridges AI and Blockchain Engineering by leveraging LLMs to... | 14 | Experimental |
| 30 | KristopherZlo/promptlab | Evala is a team workspace for prompt engineering, AI experiments,... | 14 | Experimental |
| 31 | amitpuri/llm-playground | LLM Playground - Demo Solution | 13 | Experimental |
| 32 | quantiauy/llmunit | LLMUnit is a developer-first platform designed to bring the rigors of unit... | 13 | Experimental |
| 33 | sphinx010/testAIgnite | TestAIgnite: an enterprise Cypress framework using Llama-3, Mixtral, and... | 13 | Experimental |
| 34 | Omnia9789/ai-unit-test-generator-cli | LLM-powered Python test generator CLI with single-function... | 13 | Experimental |
| 35 | cktang88/system-prompt-tester | Test system prompts | 13 | Experimental |
| 36 | RafalWilinski/prompt-testing-framework | Test how good your prompts are against the expected results. | 12 | Experimental |
| 37 | moeki0/promptest | The Prompt testing library for LLM that allows comparing patterns of prompts. | 11 | Experimental |

Comparisons in this category