onerun-ai/onerun
Open-source framework for stress-testing LLMs and conversational AI. Identify hallucinations, policy violations, and edge cases with scalable, realistic simulations. Join the Discord: https://discord.gg/ssd4S37WNW
This project helps AI product managers, QA engineers, and conversational AI designers rigorously test their large language models (LLMs) and AI agents. It takes your AI agent and simulates diverse, realistic user conversations at scale to identify issues. It produces evaluation datasets of judge-labeled conversations, along with training data for improving your AI.
No commits in the last 6 months.
Use this if you need to thoroughly stress-test your AI agents and LLMs for hallucinations, policy violations, and unexpected edge cases before they reach your users.
Not ideal if you are looking for a simple, no-code solution, as this requires a basic understanding of Docker and local environment setup.
Stars: 18
Forks: —
Language: Python
License: Apache-2.0
Category:
Last pushed: Sep 15, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/onerun-ai/onerun"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
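The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It assumes the endpoint returns JSON and that other repositories follow the same `/agents/<owner>/<repo>` path pattern as the curl example; neither is confirmed here, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from typing import Any

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository.

    Assumption: the path is <base>/<owner>/<repo>, generalized from the
    single URL shown in the curl example.
    """
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> Any:
    """Send a keyless GET request and decode the body.

    Assumption: the response body is JSON. Its fields are not documented
    here, so the decoded value is returned as-is.
    """
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("onerun-ai", "onerun")` issues the same request as the curl command. Keyless requests count against the 100 requests/day limit, so caching the result locally is a reasonable choice for repeated lookups.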
Higher-rated alternatives
4thfever/cultivation-world-simulator
A cultivation (xiuxian) world simulator built on AI Agent workflows, aiming to recreate an intelligent, open xianxia world. | An open-source Cultivation World Simulator using...
nikmcfly/MiroFish-Offline
Offline multi-agent simulation & prediction engine. English fork of MiroFish with Neo4j + Ollama...
oil-oil/wolfcha
AI-powered Werewolf (Mafia) social deduction game where every player is controlled by top LLMs...
KsanaDock/Microverse
A god-simulation sandbox game built on Godot 4 as a multi-agent AI social simulation system. In...
yasserfarouk/negmas
Negotiation Multi-Agent System (A negotiation library designed for situated negotiations within...