onerun-ai/onerun

Open-source framework for stress-testing LLMs and conversational AI. Identify hallucinations, policy violations, and edge cases with scalable, realistic simulations. Join the discord: https://discord.gg/ssd4S37WNW

Score: 23 / 100 (Experimental)

This project helps AI product managers, QA engineers, and conversational AI designers rigorously test their large language models (LLMs) and AI agents. It simulates diverse, realistic user conversations with your agent at scale to surface issues, and outputs evaluation datasets of judge-labeled conversations plus training data to improve your AI.

No commits in the last 6 months.

Use this if you need to thoroughly stress-test your AI agents and LLMs for hallucinations, policy violations, and unexpected edge cases before they reach your users.

Not ideal if you are looking for a simple, no-code solution, as this requires a basic understanding of Docker and local environment setup.

AI-testing conversational-AI LLM-evaluation AI-QA prompt-engineering
Stale (6m) · No Package · No Dependents
Maintenance: 2 / 25
Adoption: 6 / 25
Maturity: 15 / 25
Community: 0 / 25


Stars: 18
Forks:
Language: Python
License: Apache-2.0
Last pushed: Sep 15, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/onerun-ai/onerun"

Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
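The endpoint path follows the pattern shown in the curl example above (`/api/v1/quality/agents/<owner>/<repo>`). A minimal sketch of a helper that builds the URL for any repository, assuming only that the path pattern generalizes (the helper name is illustrative, not part of the API):

```python
# Build the quality-report URL for a given GitHub repo.
# The base URL and path pattern are taken from the curl example above;
# the generalization to arbitrary owner/repo pairs is an assumption.
BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"

def quality_url(owner: str, repo: str) -> str:
    """Return the quality-report endpoint for owner/repo."""
    return f"{BASE}/{owner}/{repo}"

print(quality_url("onerun-ai", "onerun"))
# → https://pt-edge.onrender.com/api/v1/quality/agents/onerun-ai/onerun
```

Pass the resulting URL to `curl` or any HTTP client as shown above.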