kayua/MalDataGen
MalDataGen is a Python framework for generating and evaluating synthetic tabular datasets with generative models, including diffusion and adversarial architectures.
This tool helps cybersecurity researchers and practitioners generate realistic, synthetic tabular datasets specifically for malware detection. You input an existing dataset of malware characteristics, and it outputs new, artificial datasets that mimic the statistical properties of the original. This allows security professionals to augment their training data and evaluate detection models more effectively.
Use this if you need to create diverse and high-quality synthetic tabular data to improve the training and evaluation of your malware detection systems.
Not ideal if you are looking for a general-purpose synthetic data generator for domains other than cybersecurity or for non-tabular data types.
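The workflow described above (real tabular samples in, statistically similar synthetic samples out) can be illustrated with a deliberately simple stand-in. The sketch below is not MalDataGen's API and does not use diffusion or adversarial models. It swaps in an independent per-class Gaussian fitted to each feature column, using only the Python standard library. The function names and the toy feature values are hypothetical and exist only for illustration.

```python
import random
import statistics


def fit_per_class_gaussians(rows, labels):
    """Estimate (mean, std) for every feature column, separately per class.

    Stand-in for a real generative model: features are treated as
    independent, which real malware features usually are not.
    """
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)

    params = {}
    for label, class_rows in grouped.items():
        columns = list(zip(*class_rows))  # transpose rows into columns
        params[label] = [
            (statistics.mean(col), statistics.pstdev(col)) for col in columns
        ]
    return params


def sample_synthetic(params, n_per_class, seed=0):
    """Draw n_per_class synthetic rows for each class from the fitted Gaussians."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rows, labels = [], []
    for label, column_params in params.items():
        for _ in range(n_per_class):
            rows.append([rng.gauss(mu, sigma) for mu, sigma in column_params])
            labels.append(label)
    return rows, labels


# Hypothetical toy dataset: two numeric features, label 0 = benign, 1 = malware.
real_rows = [
    [10.0, 0.20], [12.0, 0.30], [11.0, 0.25],
    [50.0, 0.90], [55.0, 0.80], [52.0, 0.85],
]
real_labels = [0, 0, 0, 1, 1, 1]

params = fit_per_class_gaussians(real_rows, real_labels)
synthetic_rows, synthetic_labels = sample_synthetic(params, n_per_class=100)
```

The synthetic rows can then be appended to the real training set or held out to test a detector. A model such as a GAN or a diffusion model plays the same role as `fit_per_class_gaussians` here, but it can also capture correlations between features.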
Stars: 43
Forks: 7
Language: Python
License: MIT
Category:
Last pushed: Dec 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/kayua/MalDataGen"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
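For scripted access, the curl request above could be reproduced in Python roughly as follows. This is a minimal sketch using only the standard library. The `category/owner/repo` URL pattern is inferred from the single example URL, and the assumption that the endpoint returns JSON is not confirmed by this page. The function names are hypothetical. How an API key is supplied is not documented here, so the sketch covers only the keyless tier.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the example URL on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the three-segment pattern is an assumption."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Perform the GET request and decode the body.

    Assumption: the response body is JSON. If it is not, json.loads raises
    a ValueError, and HTTP errors surface as urllib.error.HTTPError.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("generative-ai", "kayua", "MalDataGen")` issues the same request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than calling it in a loop.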
Higher-rated alternatives
sdv-dev/SDV
Synthetic data generation for tabular data
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
NVIDIA-NeMo/DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch...
AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations,...
mostly-ai/mostlyai
Synthetic Data SDK ✨