Boopi7/brain-bench
This project helps neuroscience researchers and practitioners evaluate how well large language models (LLMs) can predict the outcomes of neuroscience experiments. Given descriptions of research scenarios, it collects predictions from various LLMs so you can compare how well these models anticipate results against human expert judgment. It's designed for anyone in neuroscience curious about the predictive capabilities of advanced AI.
Use this if you want to understand the current state-of-the-art in LLM performance for predicting neuroscience research outcomes and compare them against human expert intuition.
Not ideal if you are looking for a tool to run your own primary neuroscience experiments or to develop new LLMs for specific neuroscience tasks.
Stars: 16
Forks: —
Language: TypeScript
License: MIT
Category: —
Last pushed: Oct 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Boopi7/brain-bench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
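If you'd rather call the endpoint from code than from curl, here is a minimal TypeScript sketch using the built-in fetch API (Node 18+ or a browser). The endpoint URL is the one from the curl example above; the shape of the JSON response is not documented here, so it is typed loosely as an assumption.

// Minimal sketch: fetch quality data for this repo from the public endpoint.
// The response shape is an assumption, so it is typed loosely here.
const ENDPOINT =
  "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Boopi7/brain-bench";

async function fetchQualityData(): Promise<Record<string, unknown>> {
  const res = await fetch(ENDPOINT);
  if (!res.ok) {
    // The free tier allows 100 requests/day, so rate-limit errors are possible.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return (await res.json()) as Record<string, unknown>;
}

fetchQualityData()
  .then((data) => console.log(data))
  .catch((err) => console.error(err));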
Higher-rated alternatives
sierra-research/tau2-bench
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
bigcode-project/bigcodebench
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
scicode-bench/SciCode
A benchmark that challenges language models to code solutions for scientific problems