rhesis-ai/rhesis
Open-source platform & SDK for testing LLM and agentic apps. Define expected behavior, generate and run test scenarios, and review failures collaboratively.
This platform helps teams ensure their AI applications, such as chatbots or intelligent agents, work correctly and safely before they go live. You provide plain-language requirements and context, and it generates comprehensive test scenarios. The output shows whether your AI meets expectations, avoids harmful content, and retains information, making it useful for product managers, domain experts, and engineers building AI-powered products.
296 stars. Available on PyPI.
Use this if you need a collaborative way to test your LLM or agentic applications against defined requirements and potential vulnerabilities.
Not ideal if you are looking for a post-production monitoring solution rather than a pre-production validation tool.
Stars: 296
Forks: 21
Language: Python
License: —
Category:
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/rhesis-ai/rhesis"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
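The same endpoint can also be fetched programmatically. A minimal Python sketch, assuming the endpoint returns JSON (the response schema is not documented here, so the parsed result is returned as-is):

```python
import json
from urllib.request import urlopen

# Public endpoint shown above; no API key is needed for up to
# 100 requests/day. The shape of the returned JSON is an assumption
# and should be inspected before relying on specific fields.
URL = "https://pt-edge.onrender.com/api/v1/quality/generative-ai/rhesis-ai/rhesis"

def fetch_repo_quality(url: str = URL) -> dict:
    """Fetch the quality record for the repo and parse it as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

For example, `fetch_repo_quality()` would return the same data the curl command above prints, already parsed into a Python dictionary.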
Related tools
openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
madroidmaq/mlx-omni-server
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically...
NVIDIA-NeMo/Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based...
generative-computing/mellea
Mellea is a library for writing generative programs.
taco-group/OpenEMMA
OpenEMMA, a permissively licensed open source "reproduction" of Waymo’s EMMA model.