sgoedecke/ai-poker-arena

Making multiple LLMs play Texas Hold'em against each other

Score: 20 / 100 · Experimental

This tool helps AI researchers and developers evaluate the strategic capabilities of different large language models (LLMs) by having them play Texas Hold'em poker against each other. You supply a set of LLMs, and the system simulates poker games between them, providing insight into which models demonstrate superior strategic decision-making in an adversarial environment. This is ideal for anyone who needs a novel way to benchmark AI model performance beyond traditional static benchmarks or human preference voting.

No commits in the last 6 months.

Use this if you are an AI researcher or developer looking for an objective, adversarial method to compare the strategic reasoning and decision-making of different LLMs.

Not ideal if you need to evaluate LLMs for tasks that don't involve adversarial strategy, like creative writing, summarization, or factual question-answering.

AI-evaluation LLM-benchmarking Adversarial-AI Model-comparison Strategic-AI
No License · Stale (6 mo) · No Package · No Dependents
Maintenance 0 / 25
Adoption 5 / 25
Maturity 8 / 25
Community 7 / 25

How are scores calculated?

Stars

11

Forks

1

Language

JavaScript

License

None
Last pushed

Feb 03, 2025

Commits (30d)

0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/sgoedecke/ai-poker-arena"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000 requests/day.
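The same endpoint can be called programmatically. Below is a minimal Python sketch using only the standard library; it assumes the endpoint returns JSON (the field names of the response are not documented here, so the example just pretty-prints whatever comes back).

```python
# Minimal sketch of calling the quality API from Python (stdlib only).
# Assumption: the endpoint returns a JSON document; inspect the actual
# response to learn its field names before relying on any of them.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"


def build_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality report for one repository."""
    with urllib.request.urlopen(build_url(owner, repo)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    report = fetch_quality("sgoedecke", "ai-poker-arena")
    print(json.dumps(report, indent=2))
```

The request is unauthenticated, matching the "no key needed" tier; a free key (per the note above) raises the daily limit.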