InternScience/ResearchClawBench
ResearchClawBench: Evaluating AI Agents for Automated Research from Re-Discovery to New-Discovery
This benchmark helps researchers, scientists, and analysts evaluate how well AI agents can conduct scientific research independently. You give an AI agent raw scientific data and research goals; the agent produces a publication-quality report, which is then rigorously evaluated against human-authored scientific papers. It's designed for anyone working with AI agents who needs to verify their ability to perform complex scientific tasks, from data analysis to report generation.
Use this if you need to objectively measure an AI agent's capability to perform real-world scientific research, generate reports, and compare its findings against established scientific literature.
Not ideal if you only need to test basic coding ability or factual recall rather than comprehensive, autonomous scientific inquiry.
Stars
19
Forks
—
Language
Jupyter Notebook
License
MIT
Category
Last pushed
Mar 21, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/InternScience/ResearchClawBench"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
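The curl example above can be wrapped in a small Python helper. This is a sketch: the endpoint path comes directly from the example, but the `Authorization: Bearer` header and the shape of the JSON response are assumptions, since the page does not document them.

```python
"""Sketch of querying the quality API shown above.

Only the endpoint path is taken from the page; the auth header name and
response fields are assumptions.
"""
import json
import urllib.request
from typing import Optional

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/agents"


def quality_url(owner: str, repo: str) -> str:
    """Build the API endpoint URL for a given owner/repo slug."""
    return f"{API_BASE}/{owner}/{repo}"


def fetch_quality(owner: str, repo: str, api_key: Optional[str] = None) -> dict:
    """Fetch the quality record for a repo.

    Passing an API key (header name is an assumption) would correspond to
    the higher 1,000 requests/day limit mentioned on the page.
    """
    req = urllib.request.Request(quality_url(owner, repo))
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")  # assumed header
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Same request as the curl example above.
    print(quality_url("InternScience", "ResearchClawBench"))
```

`fetch_quality` performs a live network request, so in practice you would call it only when online; `quality_url` can be used on its own to construct endpoint URLs for other repositories.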
Higher-rated alternatives
alvinunreal/awesome-autoresearch
A curated list of autonomous improvement loops, research agents, and autoresearch-style systems...
WecoAI/awesome-autoresearch
Curated list of AutoResearch use cases with optimization traces and open source implementations
krzysztofdudek/ResearcherSkill
One file. Your AI agent becomes a scientist. 30+ experiments while you sleep.
mitdbg/Carnot
Optimized System for Deep Research
OpenRaiser/NanoResearch
🦞+🔬: NanoResearch: The Autonomous AI Research Assistant