sdv-dev/SDV

Synthetic data generation for tabular data

81 / 100
Verified

This project helps data professionals create synthetic datasets that statistically resemble their real tabular data, such as customer records or transaction logs. You supply the original sensitive data, and it produces a new, artificial dataset that is designed to preserve the essential patterns and relationships without exposing private information. This suits data scientists, analysts, and researchers who need to share or develop against data while complying with privacy regulations.
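As a rough illustration of that workflow, the sketch below fits a synthesizer on a real table and samples synthetic rows. It assumes a recent SDV 1.x release in which `sdv.metadata.Metadata` and `sdv.single_table.GaussianCopulaSynthesizer` are available. The helper name `make_synthetic` and the default table name are hypothetical, and the imports sit inside the function so the snippet loads even where SDV is not installed.

```python
def make_synthetic(real_data, num_rows=100, table_name="table"):
    """Fit a synthesizer on a pandas DataFrame and return synthetic rows.

    Hypothetical helper, not part of SDV. Assumes a recent SDV 1.x release.
    """
    # Imported lazily so this module can be loaded without SDV installed.
    from sdv.metadata import Metadata
    from sdv.single_table import GaussianCopulaSynthesizer

    # Infer column types (numerical, categorical, datetime, ...) from the data.
    metadata = Metadata.detect_from_dataframe(data=real_data, table_name=table_name)

    # Learn the column distributions and correlations, then sample new rows.
    synthesizer = GaussianCopulaSynthesizer(metadata)
    synthesizer.fit(real_data)
    return synthesizer.sample(num_rows=num_rows)
```

Calling `make_synthetic(df, num_rows=500)` on a DataFrame `df` would return a new DataFrame with the same columns. Reviewing the inferred metadata before fitting is advisable, since automatic detection can misclassify columns.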

3,439 stars. Used by 5 other packages. Actively maintained with 38 commits in the last 30 days. Available on PyPI.

Use this if you need to generate realistic, anonymized datasets from your existing sensitive tabular data for development, testing, or sharing without compromising privacy.

Not ideal if you need purely random data with no statistical relationship to a source dataset, or if your data is not in a structured tabular format.

data-anonymization privacy-compliance data-generation data-analysis machine-learning-development
Maintenance 20 / 25
Adoption 15 / 25
Maturity 25 / 25
Community 21 / 25

Stars: 3,439
Forks: 417
Language: Python
License:
Last pushed: Mar 12, 2026
Commits (30d): 38
Dependencies: 14
Reverse dependents: 5

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/sdv-dev/SDV"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
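The same request can be made from Python with only the standard library. This is a minimal sketch: the URL layout is copied from the curl command above, while the parameter names (`category`, `owner`, `repo`), the helper names, and the assumption that the endpoint returns JSON are illustrative guesses rather than documented behavior.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is filled in per repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category, owner, repo):
    """Build the quality-score URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the quality record. Assumes the endpoint responds with JSON."""
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("generative-ai", "sdv-dev", "SDV")` requests the same URL as the curl command. Keyless use counts against the 100 requests/day limit, so caching the result locally is a sensible default.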