Boopi7/brain-bench
This project helps neuroscience researchers and practitioners evaluate how well large language models (LLMs) can predict the outcomes of neuroscience experiments. Given descriptions of research scenarios, it collects predictions from various LLMs so you can compare how well these models anticipate results against human expert judgment. It's designed for anyone in neuroscience curious about the predictive capabilities of advanced AI.
Use this if you want to understand the current state-of-the-art in LLM performance for predicting neuroscience research outcomes and compare them against human expert intuition.
Not ideal if you are looking for a tool to run your own primary neuroscience experiments or to develop new LLMs for specific neuroscience tasks.
Stars: 16
Forks: —
Language: TypeScript
License: MIT
Category: —
Last pushed: Oct 28, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Boopi7/brain-bench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
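If you'd rather call the endpoint from code than from curl, here is a minimal TypeScript sketch using the built-in fetch API (Node 18+ or a browser). The endpoint URL is the one from the curl example above; the shape of the JSON response is not documented here, so it is typed loosely as an assumption.

// Minimal sketch: fetch quality data for this repo from the public endpoint.
// The response shape is an assumption, so it is typed loosely here.
const ENDPOINT =
  "https://pt-edge.onrender.com/api/v1/quality/llm-tools/Boopi7/brain-bench";

async function fetchQualityData(): Promise<Record<string, unknown>> {
  const res = await fetch(ENDPOINT);
  if (!res.ok) {
    // The free tier allows 100 requests/day, so rate-limit errors are possible.
    throw new Error(`Request failed: ${res.status} ${res.statusText}`);
  }
  return (await res.json()) as Record<string, unknown>;
}

fetchQualityData()
  .then((data) => console.log(data))
  .catch((err) => console.error(err));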
Higher-rated alternatives
sierra-research/tau2-bench
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
bigcode-project/bigcodebench
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
THUDM/AgentBench
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
scicode-bench/SciCode
A benchmark that challenges language models to code solutions for scientific problems