CS-EVAL/CS-Eval
CS-Eval is a comprehensive suite for evaluating the cybersecurity capabilities of both domain-specific security models and general-purpose large language models.
This toolkit helps cybersecurity professionals and AI developers assess the cybersecurity knowledge and reasoning abilities of large language models (LLMs). Given a cybersecurity model or a general-purpose LLM, it produces an evaluation report covering 11 major cybersecurity categories and 42 subdomains, showing where a model is strong and where it falls short.
No commits in the last 6 months.
Use this if you need to objectively benchmark and compare the cybersecurity capabilities of different AI models or LLMs for applications like threat intelligence, security operations, or vulnerability analysis.
Not ideal if you are looking for a tool to secure your own systems or detect live threats; this is an evaluation tool, not a security solution.
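To make the per-category report concrete, here is a minimal, purely illustrative sketch of category-level scoring in Python. The question format, field names, and function below are assumptions for illustration, not CS-Eval's actual interface: it grades a model's multiple-choice answers and aggregates accuracy per category, the kind of breakdown CS-Eval reports across its 11 categories and 42 subdomains.

from collections import defaultdict

def score_by_category(questions, model_answers):
    # questions: dicts with "category" and "answer" keys (assumed shape,
    # not CS-Eval's real data format); model_answers: predicted choices.
    correct, total = defaultdict(int), defaultdict(int)
    for q, predicted in zip(questions, model_answers):
        total[q["category"]] += 1
        correct[q["category"]] += predicted == q["answer"]
    return {cat: correct[cat] / total[cat] for cat in total}

# Example: one question per (hypothetical) category, one answered correctly.
qs = [{"category": "Web Security", "answer": "B"},
      {"category": "Cryptography", "answer": "A"}]
print(score_by_category(qs, ["B", "C"]))  # {'Web Security': 1.0, 'Cryptography': 0.0}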
Stars: 60
Forks: 6
Language: —
License: MIT
Category:
Last pushed: Nov 27, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CS-EVAL/CS-Eval"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
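The same data can be fetched from Python; a minimal sketch using requests (the response schema is not documented on this page, so it is simply pretty-printed rather than parsed into specific fields):

import json
import requests

# Same public endpoint as the curl example; no API key needed within
# the 100 requests/day free tier.
url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CS-EVAL/CS-Eval"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))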
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas
Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
EuroEval/EuroEval
The robust European language model benchmark.
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents