InternScience/ResearchClawBench

ResearchClawBench: Evaluating AI Agents for Automated Research from Re-Discovery to New-Discovery

Quality score: 28 / 100 (Experimental)

This tool helps researchers, scientists, and analysts evaluate how well AI agents can conduct scientific research independently. You provide an AI agent with raw scientific data and research goals, and the tool produces a publication-quality report along with a rigorous evaluation against human-authored scientific papers. It's designed for anyone working with AI agents who needs to verify their ability to perform complex scientific tasks, from data analysis to report generation.

Use this if you need to objectively measure an AI agent's capability to perform real-world scientific research, generate reports, and compare its findings against established scientific literature.

Not ideal if you are looking for a tool to simply test basic coding abilities or factual recall, rather than comprehensive, autonomous scientific inquiry.

Tags: scientific-research, AI-evaluation, data-analysis, academic-publishing, experimental-science
No package · No dependents
Maintenance: 13 / 25
Adoption: 6 / 25
Maturity: 9 / 25
Community: 0 / 25


Stars: 19
Forks:
Language: Jupyter Notebook
License: MIT
Last pushed: Mar 21, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/InternScience/ResearchClawBench"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
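If you'd rather call the endpoint from a script than from curl, here is a minimal Python sketch of the same request. The base URL and owner/repo path come from the curl example above; the response shape is not documented here, so `fetch_quality` assumes the body is a JSON object.

```python
import json
import urllib.request

# Base endpoint for the quality API (taken from the curl example above).
BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"


def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch the quality record; assumes the endpoint returns a JSON object."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For this repository you would call `fetch_quality("InternScience", "ResearchClawBench")`, keeping in mind the 100 requests/day unauthenticated limit.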