AI Debate Arenas LLM Tools

Interactive platforms where AI models or humans compete in structured debates with scoring/judging. Includes multi-model comparison tools, live debate staging, and adversarial benchmarking. Does NOT include general competitive game frameworks, coding competitions, or security red-team tools without debate mechanics.

There are 36 AI debate arena tools tracked. One scores above 50 (Established tier). The highest-rated is betagouv/ComparIA at 52/100 with 63 stars.

Get all 36 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=ai-debate-arenas&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
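For scripted use, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not the service's official client: the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the body is returned as-is. The query parameters mirror the curl command above, including `limit=20`; whether a larger `limit` returns all 36 projects in one call is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="ai-debate-arenas", limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Fetch the URL and decode the JSON body.

    The response schema is an unknown here, so the decoded object is
    returned unchanged for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Building the URL needs no network access.
print(build_url())

# Hypothetical usage (performs a real request, counted against the daily quota):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to change the subcategory or limit without touching the fetch logic, and to stay within the 100 requests/day keyless quota by caching the decoded result.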

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | betagouv/ComparIA | Open source LLM arena created by the French Government | 52 | Established |
| 2 | Skytliang/Multi-Agents-Debate | MAD: The first work to explore Multi-Agent Debate with Large Language Models :D | 49 | Emerging |
| 3 | liuxiaotong/ai-dataset-radar | Multi-source async competitive intelligence engine for AI training data... | 48 | Emerging |
| 4 | Arnoldlarry15/ARES-Dashboard | AI Red Team Operations Console | 44 | Emerging |
| 5 | llm-ring/lmring | Open-source, self-hostable LLM arena with model compare, voting, and leaderboards | 41 | Emerging |
| 6 | YerbaPage/SWE-Debate | SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution | 35 | Emerging |
| 7 | khoren93/ai-debates | Orchestrate epic battles between 600+ AI models (GPT-5, Gemini 3, DeepSeek... | 35 | Emerging |
| 8 | rd-serendipity/ai-debate-arena | AI Debate Arena: Streamlit app for AI model debates. Features multi-model... | 33 | Emerging |
| 9 | debate/debate-ai.com | Debate AI enables debaters in PF, LD & Policy to streamline AI research... | 30 | Emerging |
| 10 | jhammant/agent-drift | Stress-test AI agents for goal drift and system prompt violations. Inspired... | 28 | Experimental |
| 11 | the-meta-value/The-Perfect-Storm | tracking AI system failures with solar weather | 26 | Experimental |
| 12 | aramirez-maza/beto-framework | BETO formalizes the ignorance of an AI. Materializes raw semantic intent... | 26 | Experimental |
| 13 | lukeslp/consensus | Multi-model research debate engine — 8+ language models independently... | 25 | Experimental |
| 14 | SurgeCLI/Surge | Surge is a lightweight self-learning AI observability and remediation agent... | 23 | Experimental |
| 15 | dinesh-git17/debate-lab | Watch AI models debate any topic in real-time. ChatGPT and Grok argue,... | 22 | Experimental |
| 16 | 13120740298z-lang/AI-Tech-Radar | 🤖 Automated AI technology intelligence platform — tracks GitHub AI projects,... | 22 | Experimental |
| 17 | qingni/TechSentry | 🛡️ AI-powered tech intelligence tool that auto-tracks GitHub repos, Hacker... | 22 | Experimental |
| 18 | sanifhimani/llm-colosseum | AI models fight each other in a pixel arena every day. They decide what to... | 22 | Experimental |
| 19 | armsp/AIFU | AI flub ups | 21 | Experimental |
| 20 | konradhy/battlearena | A proof of concept illustrating how AI can enhance games | 21 | Experimental |
| 21 | florykhan/TelusGuardAI | AI-powered network impact analyzer. Natural-language queries → multi-agent... | 21 | Experimental |
| 22 | netlify/nextjs-sentinel | Monitors Next.js releases for relevance to Netlify | 21 | Experimental |
| 23 | Firmislabs/ai-inventory | Detect AI frameworks, LLM dependencies, and model files in any project. One... | 19 | Experimental |
| 24 | lechmazur/debate | Adversarial multi-turn benchmark for LLM debate quality, using side-swapped... | 18 | Experimental |
| 25 | ethicals7s/EchoArena | Local LLM debate arena — make two models battle any topic offline, third... | 17 | Experimental |
| 26 | Rak2k6/gig-audit | Fair Gig Guardian is an AI-powered platform that analyzes gig economy... | 14 | Experimental |
| 27 | gregorydouglasquarles/lavaflow-site | AI-driven predictive safety ecosystem (qView) and multimedia architecture.... | 14 | Experimental |
| 28 | VAMP-NEER/release-radar-ai | 🔍 Ultimate Free GitHub Trend Tracker 2026 🚀 \| AI-Powered Repo & Dev Team... | 14 | Experimental |
| 29 | samuel-dobrancin-qa/ai-content-quality-framework | Structured framework for evaluating AI content generator quality across six... | 14 | Experimental |
| 30 | SuperAdam47/GabyT-frontend | GabyT is LLM platform with OpenAI | 14 | Experimental |
| 31 | andronov04/aiarena | Client-side AI arena for comparing 1000+ models across 68+ providers. No... | 14 | Experimental |
| 32 | Swap-24/ARGUS | Real-time AI debate arena. Argue live against opponents while an ML pipeline... | 13 | Experimental |
| 33 | Gliangquan/awesome-ai-radar | Daily curated AI, LLM, and agent project radar from GitHub | 13 | Experimental |
| 34 | Privacy-Engineering-CMU/ai-risk-prettified | A prettified page for MIT's AI Risk Database | 13 | Experimental |
| 35 | chankeypathak/AuditSync-Pro | Gen AI application that automatically compares and analyzes audit reports... | 10 | Experimental |
| 36 | ericy-eth/debatr | Debatr enables users to quickly generate quality debate speeches tailored to... | 10 | Experimental |
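
The summary figures at the top of the page (36 tools, one in the Established tier) can be cross-checked against the listing with a short tally. The sketch below uses only the score and tier values copied from the table above; it does not assume any tier thresholds beyond what the rows themselves show, and the function name `summarize` is illustrative.

```python
from collections import Counter

# (score, tier) pairs copied from the listing above, in rank order.
ENTRIES = (
    [(52, "Established")]
    + [(s, "Emerging") for s in (49, 48, 44, 41, 35, 35, 33, 30)]
    + [(s, "Experimental") for s in (
        28, 26, 26, 25, 23, 22, 22, 22, 22, 21, 21, 21, 21, 19,
        18, 17, 14, 14, 14, 14, 14, 14, 13, 13, 13, 10, 10,
    )]
)


def summarize(entries):
    """Return per-tier counts and the (min, max) score observed in each tier."""
    counts = Counter(tier for _, tier in entries)
    ranges = {}
    for score, tier in entries:
        low, high = ranges.get(tier, (score, score))
        ranges[tier] = (min(low, score), max(high, score))
    return counts, ranges


counts, ranges = summarize(ENTRIES)
print("total:", len(ENTRIES))
for tier, n in counts.items():
    low, high = ranges[tier]
    print(f"{tier}: {n} tools, scores {low} to {high}")
```

The observed ranges describe only the rows listed here; the exact cutoffs the site uses to assign tiers are not given on this page.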