apartresearch/3cb

3cb: Catastrophic Cyber Capabilities Benchmarking of Large Language Models

Score: 29 / 100 (Experimental)

This project helps cybersecurity professionals evaluate whether advanced AI agents possess autonomous hacking capabilities. It takes a collection of carefully designed cybersecurity challenges and an API key for a large language model as input. The output is a benchmark of the model's performance on these challenges, indicating its proficiency in offensive cyber operations. It is intended for AI safety researchers, cybersecurity strategists, and national security analysts concerned with the potential risks of AI.

No commits in the last 6 months.

Use this if you need to rigorously assess the offensive cyber capabilities of large language models against a diverse set of real-world-inspired challenges.

Not ideal if you are looking for a defensive AI tool or a general-purpose security scanner.

Tags: AI safety, cybersecurity, evaluation, red teaming, national security, threat intelligence
Badges: No License · Stale (6 months) · No Package · No Dependents
Maintenance 0 / 25
Adoption 6 / 25
Maturity 8 / 25
Community 15 / 25

How are scores calculated?
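The overall score appears to be the simple sum of the four subscores, each out of 25. This is an assumption inferred from the numbers shown on this page (0 + 6 + 8 + 15 = 29), not a documented formula:

```python
# Subscores as displayed on this page (each out of 25).
# Assumption: the overall score is their plain sum.
subscores = {
    "Maintenance": 0,
    "Adoption": 6,
    "Maturity": 8,
    "Community": 15,
}

total = sum(subscores.values())
print(total)  # matches the overall 29 / 100 score shown above
```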

Stars: 15
Forks: 4
Language: Python
License: None
Category: ai-red-teaming
Last pushed: Oct 30, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/apartresearch/3cb"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
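For scripted use, here is a minimal Python sketch of calling the same endpoint. The URL components (base path, category, owner, repo) are taken from the curl example above; the response schema is not documented here, so the live fetch is left commented out rather than assuming field names:

```python
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for a repository."""
    return f"{API_BASE}/{category}/{owner}/{repo}"

url = quality_url("ml-frameworks", "apartresearch", "3cb")
print(url)

# Uncomment to fetch live data (no key needed up to 100 requests/day):
# with urllib.request.urlopen(url) as resp:
#     data = json.load(resp)
#     print(json.dumps(data, indent=2))
```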