Icyrockton/MegaVul

MegaVul - The largest, high-quality, extensible, continuously updated, C/C++/Java vulnerability dataset

Score: 41 / 100 (Emerging)

This project provides a dataset of C, C++, and Java functions, each labeled as vulnerable or non-vulnerable. It helps security researchers and developers train models to detect software vulnerabilities automatically. A model trained on it takes a code function (and optionally its Joern graph representation) as input and outputs a classification indicating whether that function is vulnerable.

139 stars. No commits in the last 6 months.

Use this if you are a security researcher or software developer building or evaluating machine learning models for automated vulnerability detection in C, C++, or Java codebases.

Not ideal if you need a real-time vulnerability scanner for active codebases or a tool to analyze application security posture without building custom detection models.

Tags: software-security, vulnerability-detection, static-analysis, code-auditing, security-research
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 15 / 25
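
The four category scores above add up to the overall 41 / 100. The page does not say how each category score is derived, so the following is only an arithmetic check, assuming the overall score is a plain sum of the four categories (each out of 25):

```python
# Category scores as listed on this page; each is out of 25.
category_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 15,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(category_scores.values())
maximum = MAX_PER_CATEGORY * len(category_scores)

print(f"{total} / {maximum}")  # prints "41 / 100"
```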

Stars: 139
Forks: 18
Language: Python
License: GPL-3.0
Last pushed: Jan 12, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/Icyrockton/MegaVul"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
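
The same request can be made from Python. This is a minimal sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and it assumes the endpoint returns a JSON body, which this page does not document:

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for one repository.

    Assumption: the response body is JSON. Its fields are not described
    on this page, so the decoded object is returned unmodified.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("ml-frameworks", "Icyrockton", "MegaVul")` issues the same GET request as the curl command. With no key, requests count against the 100/day limit.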