AI Debate Arenas LLM Tools
Interactive platforms where AI models or humans compete in structured debates with scoring/judging. Includes multi-model comparison tools, live debate staging, and adversarial benchmarking. Does NOT include general competitive game frameworks, coding competitions, or security red-team tools without debate mechanics.
There are 36 AI debate arena tools tracked. One scores above 50 (established tier). The highest-rated is betagouv/ComparIA at 52/100 with 63 stars.
Get all 36 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=ai-debate-arenas&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
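The same request can be made from a script. The following is a minimal Python sketch, assuming only the endpoint and the three query parameters (`domain`, `subcategory`, `limit`) shown in the curl command; the helper names `build_url` and `fetch` are hypothetical, and the shape of the JSON response is not documented here, so the code does not assume one. Note that the curl example uses `limit=20`; whether a larger value returns all 36 projects in one call is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="ai-debate-arenas", limit=20):
    """Build the query URL with the same parameters as the curl example.

    Hypothetical helper: only the parameter names come from the source.
    """
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response structure is not assumed; inspect it before relying on
    any particular key.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch(build_url())` performs the request; `build_url(limit=36)` is one way to ask for more rows, if the API accepts that value.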
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | betagouv/ComparIA<br>Open source LLM arena created by the French Government | 52/100 | Established |
| 2 | Skytliang/Multi-Agents-Debate<br>MAD: The first work to explore Multi-Agent Debate with Large Language Models :D | | Emerging |
| 3 | liuxiaotong/ai-dataset-radar<br>Multi-source async competitive intelligence engine for AI training data... | | Emerging |
| 4 | Arnoldlarry15/ARES-Dashboard<br>AI Red Team Operations Console | | Emerging |
| 5 | llm-ring/lmring<br>Open-source, self-hostable LLM arena with model compare, voting, and leaderboards | | Emerging |
| 6 | YerbaPage/SWE-Debate<br>SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution | | Emerging |
| 7 | khoren93/ai-debates<br>Orchestrate epic battles between 600+ AI models (GPT-5, Gemini 3, DeepSeek... | | Emerging |
| 8 | rd-serendipity/ai-debate-arena<br>AI Debate Arena: Streamlit app for AI model debates. Features multi-model... | | Emerging |
| 9 | debate/debate-ai.com<br>Debate AI enables debaters in PF, LD & Policy to streamline AI research... | | Emerging |
| 10 | jhammant/agent-drift<br>Stress-test AI agents for goal drift and system prompt violations. Inspired... | | Experimental |
| 11 | the-meta-value/The-Perfect-Storm<br>tracking AI system failures with solar weather | | Experimental |
| 12 | aramirez-maza/beto-framework<br>BETO formalizes the ignorance of an AI. Materializes raw semantic intent... | | Experimental |
| 13 | lukeslp/consensus<br>Multi-model research debate engine — 8+ language models independently... | | Experimental |
| 14 | SurgeCLI/Surge<br>Surge is a lightweight self-learning AI observability and remediation agent... | | Experimental |
| 15 | dinesh-git17/debate-lab<br>Watch AI models debate any topic in real-time. ChatGPT and Grok argue,... | | Experimental |
| 16 | 13120740298z-lang/AI-Tech-Radar<br>🤖 Automated AI technology intelligence platform — tracks GitHub AI projects,... | | Experimental |
| 17 | qingni/TechSentry<br>🛡️ AI-powered tech intelligence tool that auto-tracks GitHub repos, Hacker... | | Experimental |
| 18 | sanifhimani/llm-colosseum<br>AI models fight each other in a pixel arena every day. They decide what to... | | Experimental |
| 19 | armsp/AIFU<br>AI flub ups | | Experimental |
| 20 | konradhy/battlearena<br>A proof of concept illustrating how AI can enhance games | | Experimental |
| 21 | florykhan/TelusGuardAI<br>AI-powered network impact analyzer. Natural-language queries → multi-agent... | | Experimental |
| 22 | netlify/nextjs-sentinel<br>Monitors Next.js releases for relevance to Netlify | | Experimental |
| 23 | Firmislabs/ai-inventory<br>Detect AI frameworks, LLM dependencies, and model files in any project. One... | | Experimental |
| 24 | lechmazur/debate<br>Adversarial multi-turn benchmark for LLM debate quality, using side-swapped... | | Experimental |
| 25 | ethicals7s/EchoArena<br>Local LLM debate arena — make two models battle any topic offline, third... | | Experimental |
| 26 | Rak2k6/gig-audit<br>Fair Gig Guardian is an AI-powered platform that analyzes gig economy... | | Experimental |
| 27 | gregorydouglasquarles/lavaflow-site<br>AI-driven predictive safety ecosystem (qView) and multimedia architecture.... | | Experimental |
| 28 | VAMP-NEER/release-radar-ai<br>🔍 Ultimate Free GitHub Trend Tracker 2026 🚀 \| AI-Powered Repo & Dev Team... | | Experimental |
| 29 | samuel-dobrancin-qa/ai-content-quality-framework<br>Structured framework for evaluating AI content generator quality across six... | | Experimental |
| 30 | SuperAdam47/GabyT-frontend<br>GabyT is LLM platform with OpenAI | | Experimental |
| 31 | andronov04/aiarena<br>Client-side AI arena for comparing 1000+ models across 68+ providers. No... | | Experimental |
| 32 | Swap-24/ARGUS<br>Real-time AI debate arena. Argue live against opponents while an ML pipeline... | | Experimental |
| 33 | Gliangquan/awesome-ai-radar<br>Daily curated AI, LLM, and agent project radar from GitHub | | Experimental |
| 34 | Privacy-Engineering-CMU/ai-risk-prettified<br>A prettified page for MIT's AI Risk Database | | Experimental |
| 35 | chankeypathak/AuditSync-Pro<br>Gen AI application that automatically compares and analyzes audit reports... | | Experimental |
| 36 | ericy-eth/debatr<br>Debatr enables users to quickly generate quality debate speeches tailored to... | | Experimental |