lucagioacchini/auto-pen-bench

This repo contains the code for the penetration testing benchmark for generative agents presented in the paper "AutoPenBench: Benchmarking Generative Agents for Penetration Testing". It also includes instructions to install, develop, and test new vulnerable containers for inclusion in the benchmark.

Overall score: 49 / 100 (Emerging)

This benchmark helps cybersecurity researchers and professionals evaluate how well AI-powered penetration testing agents find vulnerabilities in simulated systems. Given a generative AI agent and the definition of a vulnerable machine, it reports the agent's ability to identify and exploit weaknesses. Security analysts, red teamers, and AI researchers working in offensive security can use it to assess agent performance.

Use this if you are developing or evaluating generative AI agents for automated penetration testing and need a standardized way to measure their effectiveness against various vulnerabilities.

Not ideal if you are looking for a tool to perform actual penetration tests on live production systems, or if you are not working with generative AI agents.

penetration-testing red-teaming vulnerability-assessment cybersecurity-research generative-AI-security
No package · No dependents
Maintenance: 6 / 25
Adoption: 8 / 25
Maturity: 16 / 25
Community: 19 / 25

How are scores calculated?

Stars: 68
Forks: 20
Language: Python
License: MIT
Last pushed: Oct 28, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/lucagioacchini/auto-pen-bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
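The same endpoint can be queried from Python instead of curl. A minimal sketch follows; the URL path is taken verbatim from the curl example above, but the JSON response schema is not documented here, so the fetch step is left generic rather than assuming particular field names.

```python
# Sketch of calling the quality API from Python (stdlib only).
# The endpoint path comes from the curl example above; no response
# schema is assumed.
import json
from urllib.request import urlopen

BASE = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(ecosystem: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{BASE}/{ecosystem}/{owner}/{repo}"


url = quality_url("generative-ai", "lucagioacchini", "auto-pen-bench")
print(url)

# Uncomment to fetch live data (subject to the 100 requests/day limit):
# with urlopen(url) as resp:
#     data = json.load(resp)
# print(data)
```

Without an API key this is limited to 100 requests per day, so cache responses locally if you poll multiple repositories.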