danilop/llm-test-mate

A simple testing framework to evaluate and validate LLM-generated content using string similarity, semantic similarity, and model-based evaluation.

Quality score: 28 / 100 (Experimental)

This tool helps developers working with Large Language Models (LLMs) ensure the quality and accuracy of generated text. It compares LLM-generated text against known reference texts using several methods: string similarity (how alike the words are), semantic similarity (how alike the meaning is), and model-based evaluation, in which another LLM acts as a judge. The output provides similarity scores and pass/fail statuses, which helps in validating and improving LLM outputs.
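
As a rough illustration of the first two methods (this is not llm-test-mate's own API, which this page does not show), the sketch below uses Python's difflib for word-level similarity and the sentence-transformers library for meaning-level similarity; the model name and the 0.8 pass threshold are arbitrary examples.

# Illustrative sketch only: NOT llm-test-mate's API.
from difflib import SequenceMatcher
from sentence_transformers import SentenceTransformer, util

def string_similarity(generated: str, reference: str) -> float:
    # Surface-level similarity: how alike the characters/words are.
    return SequenceMatcher(None, generated, reference).ratio()

def semantic_similarity(generated: str, reference: str) -> float:
    # Meaning-level similarity: cosine similarity of sentence embeddings.
    model = SentenceTransformer("all-MiniLM-L6-v2")  # arbitrary model choice
    emb = model.encode([generated, reference])
    return float(util.cos_sim(emb[0], emb[1]))

generated = "Paris is the capital of France."
reference = "The capital of France is Paris."

print(f"string similarity:   {string_similarity(generated, reference):.2f}")
score = semantic_similarity(generated, reference)
print(f"semantic similarity: {score:.2f}")
print("PASS" if score >= 0.8 else "FAIL")  # threshold is illustrative
# The third method, model-based evaluation, would instead send both texts
# to a judge LLM with a grading prompt and parse its verdict.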

No commits in the last 6 months.

Use this if you are a developer building applications with LLMs and need a systematic way to test whether your LLM's outputs meet expected standards for accuracy and content.

Not ideal if you are looking for a no-code solution or a tool for general text analysis unrelated to LLM output validation.

Tags: LLM development, AI model validation, natural language processing, text generation quality, software testing
Badges: Stale (6m), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 7 / 25

Stars: 10
Forks: 1
Language: Python
License: MIT
Last pushed: Jan 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/danilop/llm-test-mate"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
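
For programmatic use, here is a minimal Python sketch that fetches and pretty-prints the response; only the endpoint URL comes from this page, and the response schema is an assumption.

import json
import urllib.request

# Only the URL below is shown on this page; the response shape is not
# documented here, so we simply pretty-print whatever JSON comes back.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "generative-ai/danilop/llm-test-mate")

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))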