sgoedecke/ai-poker-arena
Making multiple LLMs play Texas Hold'em against each other
This tool helps AI researchers and developers evaluate the strategic capabilities of different large language models (LLMs) by having them play Texas Hold'em poker against each other. You supply the LLMs to compare, and the system simulates poker games between them, showing which models make stronger strategic decisions in an adversarial setting. It suits anyone who wants a way to benchmark model performance beyond traditional benchmarks or human voting.
No commits in the last 6 months.
Use this if you are an AI researcher or developer looking for an objective, adversarial method to compare the strategic reasoning and decision-making of different LLMs.
Not ideal if you need to evaluate LLMs for tasks that don't involve adversarial strategy, like creative writing, summarization, or factual question-answering.
Stars: 11
Forks: 1
Language: JavaScript
License: —
Category:
Last pushed: Feb 03, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/sgoedecke/ai-poker-arena"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
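The same request can be made from a script. The following is a minimal Python sketch using only the standard library. It makes two assumptions that the listing does not confirm: that the endpoint returns JSON, and that the URL pattern generalizes to other `owner/repo` pairs. The helper names `build_url` and `fetch_quality` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for an owner/repo pair (hypothetical helper).

    Assumes other repositories follow the same /<owner>/<repo> pattern.
    """
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Request the data and decode it, assuming the body is JSON."""
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("sgoedecke", "ai-poker-arena")` requests the same URL as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than re-fetching on every run.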
Higher-rated alternatives
google-deepmind/concordia
A library for generative social simulation
Mai-xiyu/Minecraft_AI
AI Play Minecraft
mikelma/craftium
A framework for creating rich, 3D, Minecraft-like single and multi-agent environments for AI...
cocacola-lab/MineLand
Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs
rezaho/MARSYS
Multi-Agent Reasoning Systems