CS-EVAL/CS-Eval
CS-Eval is a comprehensive suite for evaluating the cybersecurity capabilities of both domain-specific security models and general-purpose large language models.
This toolkit helps cybersecurity professionals and AI developers assess the cybersecurity knowledge and reasoning abilities of large language models (LLMs). Given a cybersecurity model or a general-purpose LLM, it produces an evaluation report covering 11 major cybersecurity categories and 42 subdomains, showing where a model is strong and where it falls short.
No commits in the last 6 months.
Use this if you need to objectively benchmark and compare the cybersecurity capabilities of different AI models or LLMs for applications like threat intelligence, security operations, or vulnerability analysis.
Not ideal if you are looking for a tool to secure your own systems or detect live threats; this is an evaluation tool, not a security solution.
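To make the per-category report concrete, here is a minimal, purely illustrative sketch of category-level scoring in Python. The question format, field names, and function below are assumptions for illustration, not CS-Eval's actual interface: it grades a model's multiple-choice answers and aggregates accuracy per category, the kind of breakdown CS-Eval reports across its 11 categories and 42 subdomains.

from collections import defaultdict

def score_by_category(questions, model_answers):
    # questions: dicts with "category" and "answer" keys (assumed shape,
    # not CS-Eval's real data format); model_answers: predicted choices.
    correct, total = defaultdict(int), defaultdict(int)
    for q, predicted in zip(questions, model_answers):
        total[q["category"]] += 1
        correct[q["category"]] += predicted == q["answer"]
    return {cat: correct[cat] / total[cat] for cat in total}

# Example: one question per (hypothetical) category, one answered correctly.
qs = [{"category": "Web Security", "answer": "B"},
      {"category": "Cryptography", "answer": "A"}]
print(score_by_category(qs, ["B", "C"]))  # {'Web Security': 1.0, 'Cryptography': 0.0}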
Stars: 60
Forks: 6
Language: —
License: MIT
Category:
Last pushed: Nov 27, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CS-EVAL/CS-Eval"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
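The same data can be fetched from Python; a minimal sketch using requests (the response schema is not documented on this page, so it is simply pretty-printed rather than parsed into specific fields):

import json
import requests

# Same public endpoint as the curl example; no API key needed within
# the 100 requests/day free tier.
url = "https://pt-edge.onrender.com/api/v1/quality/llm-tools/CS-EVAL/CS-Eval"
resp = requests.get(url, timeout=10)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))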
Higher-rated alternatives
EvolvingLMMs-Lab/lmms-eval
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
vibrantlabsai/ragas
Supercharge Your LLM Application Evaluations 🚀
open-compass/VLMEvalKit
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
EuroEval/EuroEval
The robust European language model benchmark.
Giskard-AI/giskard-oss
🐢 Open-Source Evaluation & Testing library for LLM Agents