danilop/llm-test-mate

A simple testing framework to evaluate and validate LLM-generated content using string similarity, semantic similarity, and model-based evaluation.

Quality score: 28 / 100 (Experimental)

This tool helps developers working with Large Language Models (LLMs) ensure the quality and accuracy of generated text. It compares LLM-generated text against known reference texts using several methods: string similarity (how alike the words are), semantic similarity (how alike the meaning is), and model-based evaluation, in which another LLM acts as a judge. The output provides similarity scores and pass/fail statuses, which helps in validating and improving LLM outputs.
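
As a rough illustration of the first two methods (this is not llm-test-mate's own API, which this page does not show), the sketch below uses Python's difflib for word-level similarity and the sentence-transformers library for meaning-level similarity; the model name and the 0.8 pass threshold are arbitrary examples.

# Illustrative sketch only: NOT llm-test-mate's API.
from difflib import SequenceMatcher
from sentence_transformers import SentenceTransformer, util

def string_similarity(generated: str, reference: str) -> float:
    # Surface-level similarity: how alike the characters/words are.
    return SequenceMatcher(None, generated, reference).ratio()

def semantic_similarity(generated: str, reference: str) -> float:
    # Meaning-level similarity: cosine similarity of sentence embeddings.
    model = SentenceTransformer("all-MiniLM-L6-v2")  # arbitrary model choice
    emb = model.encode([generated, reference])
    return float(util.cos_sim(emb[0], emb[1]))

generated = "Paris is the capital of France."
reference = "The capital of France is Paris."

print(f"string similarity:   {string_similarity(generated, reference):.2f}")
score = semantic_similarity(generated, reference)
print(f"semantic similarity: {score:.2f}")
print("PASS" if score >= 0.8 else "FAIL")  # threshold is illustrative
# The third method, model-based evaluation, would instead send both texts
# to a judge LLM with a grading prompt and parse its verdict.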

No commits in the last 6 months.

Use this if you are a developer building applications with LLMs and need a systematic way to test whether your LLM's outputs meet expected standards for accuracy and content.

Not ideal if you are looking for a no-code solution or a tool for general text analysis unrelated to LLM output validation.

Tags: LLM development, AI model validation, natural language processing, text generation quality, software testing
Badges: Stale (6m), No Package, No Dependents
Maintenance: 0 / 25
Adoption: 5 / 25
Maturity: 16 / 25
Community: 7 / 25

Stars: 10
Forks: 1
Language: Python
License: MIT
Last pushed: Jan 23, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/danilop/llm-test-mate"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
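
For programmatic use, here is a minimal Python sketch that fetches and pretty-prints the response; only the endpoint URL comes from this page, and the response schema is an assumption.

import json
import urllib.request

# Only the URL below is shown on this page; the response shape is not
# documented here, so we simply pretty-print whatever JSON comes back.
URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "generative-ai/danilop/llm-test-mate")

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

print(json.dumps(data, indent=2))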