SDV and SDGym
SDGym is a benchmarking framework for evaluating and comparing synthetic data generation methods. It complements SDV by letting practitioners measure SDV's performance against alternative approaches.
About SDV
sdv-dev/SDV
Synthetic data generation for tabular data
This project helps data professionals create artificial datasets that statistically resemble their real-world tabular data, such as customer records or transaction logs. You input your original sensitive data, and it outputs a new synthetic dataset that is intended to maintain the essential patterns and relationships without reproducing the original private records. This is ideal for data scientists, analysts, and researchers who need to share or develop with data while adhering to privacy regulations.
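The fit-then-sample idea described above can be sketched with the Python standard library alone. This is a toy stand-in, not SDV's API or algorithm: it assumes two numeric columns, models them as a correlated Gaussian pair, and the names `pearson`, `fit`, `sample`, and the example rows are all hypothetical.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit(rows):
    """Learn per-column mean and standard deviation plus the column correlation."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    return {
        "mean": (statistics.fmean(xs), statistics.fmean(ys)),
        "std": (statistics.stdev(xs), statistics.stdev(ys)),
        "corr": pearson(xs, ys),
    }


def sample(model, num_rows, seed=0):
    """Draw new rows from the fitted model; no real row is copied."""
    rng = random.Random(seed)
    (mx, my), (sx, sy), rho = model["mean"], model["std"], model["corr"]
    # Clamp guards against tiny floating-point overshoot when |rho| is near 1.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(num_rows):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Mixing z1 into the second column reproduces the learned correlation.
        rows.append((mx + sx * z1, my + sy * (rho * z1 + residual * z2)))
    return rows


# Hypothetical "real" table: (customer age, monthly spend).
real_rows = [
    (23, 120.0), (35, 210.5), (41, 260.0), (29, 150.0),
    (52, 330.0), (47, 300.5), (31, 190.0), (38, 240.0),
]

model = fit(real_rows)
synthetic_rows = sample(model, num_rows=2000)
```

The synthetic rows follow the learned means, spreads, and age-to-spend relationship, which is the property the description refers to. Real tabular data has categorical columns, constraints, and non-Gaussian shapes that this sketch ignores.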
About SDGym
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
This tool helps data practitioners evaluate and compare different methods for creating synthetic datasets. You input various synthetic data generation models and your original datasets, and it outputs detailed reports on performance, memory usage, and the quality and privacy of the generated synthetic data. Data scientists and machine learning engineers who work with sensitive or limited real-world data would find this useful.
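The kind of comparison described above can be illustrated with a small benchmarking loop. This is a minimal sketch using only the Python standard library, not SDGym's API: the two synthesizers, the `quality_score` and `exact_match_rate` metrics, and the single-column dataset are hypothetical choices made for illustration.

```python
import random
import statistics
import time
import tracemalloc


def bootstrap_synthesizer(real, num_rows, rng):
    """Hypothetical baseline: resample real values with replacement."""
    return [rng.choice(real) for _ in range(num_rows)]


def gaussian_synthesizer(real, num_rows, rng):
    """Hypothetical model: sample from a Gaussian fitted to the real column."""
    mu, sigma = statistics.fmean(real), statistics.stdev(real)
    return [rng.gauss(mu, sigma) for _ in range(num_rows)]


def quality_score(real, synthetic):
    """Crude score in [0, 1]: 1 minus the average relative gap in mean and std."""
    mean_gap = abs(statistics.fmean(synthetic) - statistics.fmean(real))
    std_gap = abs(statistics.stdev(synthetic) - statistics.stdev(real))
    rel = (mean_gap / abs(statistics.fmean(real)) + std_gap / statistics.stdev(real)) / 2
    return max(0.0, 1.0 - rel)


def exact_match_rate(real, synthetic):
    """Crude privacy signal: fraction of synthetic values copied from real data."""
    real_values = set(real)
    return sum(v in real_values for v in synthetic) / len(synthetic)


def benchmark(synthesizers, real, num_rows=1000, seed=0):
    """Run each synthesizer once and record time, peak memory, and scores."""
    results = []
    for name, synthesize in synthesizers.items():
        rng = random.Random(seed)
        tracemalloc.start()
        start = time.perf_counter()
        synthetic = synthesize(real, num_rows, rng)
        elapsed = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append({
            "synthesizer": name,
            "seconds": elapsed,
            "peak_bytes": peak_bytes,
            "quality": quality_score(real, synthetic),
            "exact_match_rate": exact_match_rate(real, synthetic),
        })
    return results


# Hypothetical single-column "real" dataset.
data_rng = random.Random(42)
real = [data_rng.gauss(50.0, 10.0) for _ in range(500)]

results = benchmark(
    {"bootstrap": bootstrap_synthesizer, "gaussian": gaussian_synthesizer},
    real,
)
for row in results:
    print(row)
```

The bootstrap baseline scores well on quality but copies every value from the real data, so a benchmark that reports only quality would miss the privacy problem. Reporting several metrics side by side, as the description above outlines, makes that trade-off visible.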