superagent-ai/poker-eval
A comprehensive tool for assessing AI agent performance in simulated poker environments
This tool evaluates how different AI models or agents perform in simulated No-Limit Texas Hold'em games. You provide the agents you want to test, and the system simulates thousands of hands, producing detailed performance data such as profit per hand. It's designed for researchers and developers who need to benchmark AI decision-making objectively in complex, uncertain environments.
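To make the profit-per-hand metric concrete, here is a minimal TypeScript sketch of how per-hand results can be aggregated into a mean and a standard error. It is illustrative only: the summarizeProfits function and its handProfits input are hypothetical, not part of poker-eval's actual API.

// Aggregates simulated per-hand profits into summary statistics.
// Hypothetical helper for illustration; not poker-eval's API.
interface ProfitSummary {
  hands: number;
  meanProfitPerHand: number; // average chips won (or lost) per hand
  standardError: number;     // uncertainty of that average
}

function summarizeProfits(handProfits: number[]): ProfitSummary {
  const hands = handProfits.length;
  const mean = handProfits.reduce((sum, p) => sum + p, 0) / hands;
  // Sample variance; requires at least two hands.
  const variance =
    handProfits.reduce((sum, p) => sum + (p - mean) ** 2, 0) / (hands - 1);
  return {
    hands,
    meanProfitPerHand: mean,
    standardError: Math.sqrt(variance / hands),
  };
}

// Example: profits from five simulated hands (positive = agent won chips).
console.log(summarizeProfits([120, -40, 0, -80, 60]));

Because per-hand variance in No-Limit Hold'em is large, the standard error shrinks slowly, which is why a meaningful benchmark needs thousands of simulated hands.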
No commits in the last 6 months. Available on npm.
Use this if you are developing or comparing AI agents and need a standardized, robust way to measure their strategic performance in poker.
Not ideal if you are looking for a poker game simulator for human players or a tool to analyze human poker strategies.
Stars: 21
Forks: 4
Language: TypeScript
License: —
Category: —
Last pushed: Nov 27, 2024
Commits (30d): 0
Dependencies: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/superagent-ai/poker-eval"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000 requests/day.
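If you prefer to consume the endpoint from code rather than curl, the sketch below shows the equivalent request in TypeScript. The endpoint returns JSON, but the payload's field names are not documented here, so treat the shape as an assumption and inspect it before depending on it.

// Fetch the same quality data from Node.js 18+ or a browser.
const url =
  "https://pt-edge.onrender.com/api/v1/quality/llm-tools/superagent-ai/poker-eval";

async function fetchToolData(): Promise<unknown> {
  const res = await fetch(url);
  if (!res.ok) {
    // A 429 here likely means the 100 requests/day anonymous limit was hit.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return res.json(); // JSON payload; exact shape is undocumented here
}

fetchToolData().then((data) => console.log(data));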
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas
Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
EuroEval/EuroEval
The robust European language model benchmark.
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents