setrf/forecasterarena

AI models competing in prediction markets. Reality as the ultimate benchmark. Seven frontier LLMs forecast real-world events through Polymarket. No memorization possible - only genuine forecasting ability.

Score: 40 / 100 (Emerging)

This project evaluates how well different AI models predict real-world outcomes by having them paper trade in prediction markets such as Polymarket. It takes in market data and model predictions, then outputs performance metrics such as Brier scores and portfolio value. It is aimed at researchers, data scientists, and strategists interested in the practical forecasting ability of AI.
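As a rough illustration of the Brier score mentioned above, the sketch below computes the standard mean squared error between forecast probabilities and binary outcomes (lower is better; always guessing 0.5 scores 0.25). The `ResolvedForecast` type and `brierScore` function are hypothetical names chosen for this example and are not taken from the project's code.

```typescript
// Hypothetical shape for illustration: one resolved binary market and the
// probability a model assigned to the "yes" outcome.
interface ResolvedForecast {
  probability: number; // forecast that the event happens, in [0, 1]
  outcome: 0 | 1; // 1 if the event happened, 0 otherwise
}

// Standard Brier score for binary events: mean of (probability - outcome)^2.
function brierScore(forecasts: ResolvedForecast[]): number {
  if (forecasts.length === 0) {
    throw new Error("brierScore needs at least one resolved forecast");
  }
  const total = forecasts.reduce(
    (sum, f) => sum + (f.probability - f.outcome) ** 2,
    0,
  );
  return total / forecasts.length;
}

// Made-up forecasts: squared errors are 0.01, 0.04 and 0.25,
// so the score is approximately 0.1.
const exampleForecasts: ResolvedForecast[] = [
  { probability: 0.9, outcome: 1 },
  { probability: 0.2, outcome: 0 },
  { probability: 0.5, outcome: 1 },
];
const exampleScore = brierScore(exampleForecasts);
```

The benchmark may weight or aggregate scores differently; this shows only the textbook definition.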

Use this if you want to rigorously benchmark the forecasting capabilities of various large language models (LLMs) on actual future events, ensuring they can't rely on memorized training data.

Not ideal if you're looking for a tool to make live, real-money trades, as this is a paper-trading benchmark for research and evaluation purposes only.
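To make the paper-trading idea concrete, here is a minimal sketch of marking a simulated portfolio to market: its value is the uninvested cash plus each position's share count times the current market price. The `Position` type, the `portfolioValue` function, and all numbers are assumptions for illustration, not the project's actual accounting.

```typescript
// Hypothetical position in a simulated (paper) portfolio.
interface Position {
  shares: number; // number of outcome shares held
  currentPrice: number; // latest market price per share, in [0, 1]
}

// Mark-to-market value: cash plus shares * current price for every position.
function portfolioValue(cash: number, positions: Position[]): number {
  return positions.reduce(
    (value, p) => value + p.shares * p.currentPrice,
    cash,
  );
}

// Made-up portfolio: 100 in cash, plus positions worth about 30 and 5,
// for a total of about 135.
const paperValue = portfolioValue(100, [
  { shares: 50, currentPrice: 0.6 },
  { shares: 20, currentPrice: 0.25 },
]);
```

Because no real orders are placed, a value like this is purely a bookkeeping figure used to compare models.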

Tags: AI-evaluation, forecasting, prediction-markets, model-benchmarking, algorithmic-trading-research
No package, no dependents
Maintenance 10 / 25
Adoption 5 / 25
Maturity 13 / 25
Community 12 / 25

Stars: 11
Forks: 2
Language: TypeScript
License: MIT
Last pushed: Mar 09, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/setrf/forecasterarena"

The API is open to everyone: 100 requests/day with no key, or 1,000/day with a free key.
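The same request can be sketched in TypeScript. This assumes the endpoint returns JSON and that the path follows a category/owner/repo pattern, both inferred from the single URL above rather than from documented behavior. `qualityUrl` and `fetchQuality` are hypothetical helper names, and the global `fetch` is assumed to be available (browsers, or Node 18 and later).

```typescript
const BASE_URL = "https://pt-edge.onrender.com/api/v1/quality";

// Builds the request URL. The path layout is an assumption inferred from
// the one example URL shown in this listing.
function qualityUrl(category: string, owner: string, repo: string): string {
  return [category, owner, repo]
    .map((part) => encodeURIComponent(part))
    .reduce((url, part) => `${url}/${part}`, BASE_URL);
}

// Fetches the data and parses it as JSON. The response shape is not
// documented here, so it is returned as `unknown` for the caller to inspect.
async function fetchQuality(url: string): Promise<unknown> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

const forecasterArenaUrl = qualityUrl("llm-tools", "setrf", "forecasterarena");
```

Calling `fetchQuality(forecasterArenaUrl)` would then issue the request. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution.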