danilop/llm-test-mate
A simple testing framework to evaluate and validate LLM-generated content using string similarity, semantic similarity, and model-based evaluation.
This tool helps developers working with Large Language Models (LLMs) ensure the quality and accuracy of generated text. It compares LLM output against known reference texts using several methods: lexical (string) similarity, semantic similarity, and LLM-as-judge evaluation, where another model scores the output. Each check returns a similarity score and a pass/fail status, which helps in validating and improving LLM outputs.
No commits in the last 6 months.
Use this if you are a developer building applications with LLMs and need a systematic way to test whether your LLM's outputs meet expected standards for accuracy and content.
Not ideal if you are looking for a no-code solution or a tool for general text analysis unrelated to LLM output validation.
Stars
10
Forks
1
Language
Python
License
MIT
Last pushed
Jan 23, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/danilop/llm-test-mate"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
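The same endpoint can be called from Python instead of curl. The sketch below builds the per-repository URL and fetches it with the standard library; the URL pattern comes from the curl example above, but the JSON response fields are not documented here, so the decoded dict's contents are an assumption.

```python
# Sketch of calling the stats API from Python; URL pattern taken from the
# curl example above. Response schema is not documented here, so the
# returned dict's fields are an assumption.
import json
from urllib.request import urlopen

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/generative-ai"


def stats_url(owner: str, repo: str) -> str:
    """Build the per-repository endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_stats(owner: str, repo: str) -> dict:
    """Fetch and decode the JSON stats for one repository (needs network)."""
    with urlopen(stats_url(owner, repo), timeout=10) as resp:
        return json.load(resp)


# Example (performs a network request, subject to the daily rate limit):
# data = fetch_stats("danilop", "llm-test-mate")
```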
Higher-rated alternatives
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate...