aia39/Synthetic-Tabular-Data-Generation-using-CTGAN-and-classify-with-XGboost
A repository for generating synthetic tabular data when the data is imbalanced in one or more features.
This tool helps data analysts and researchers create artificial rows that mimic the patterns of existing spreadsheet-like data. It takes raw tabular data, especially data in which certain categories are underrepresented, and generates new, synthetic rows. The result is a larger, more balanced dataset for tasks like predictive modeling or statistical analysis.
No commits in the last 6 months.
Use this if you have an imbalanced dataset where one outcome or category has significantly fewer examples than others, and you need more data for robust analysis or model training.
Not ideal if your primary concern is data privacy and you need to generate synthetic data without any direct reference to existing sensitive information.
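To make the imbalance described above concrete, the following minimal sketch counts how many synthetic rows each category would need in order to match the largest one. It is an illustration only, not code from the repository: the function name `rows_needed_to_balance` and the toy labels are hypothetical, and the generation step itself (for example with a CTGAN model) is deliberately left out.

```python
from collections import Counter


def rows_needed_to_balance(labels):
    """Return {category: synthetic rows needed to match the largest category}."""
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values())  # size of the most frequent category
    return {category: target - n for category, n in counts.items()}


# Hypothetical toy labels: 8 rows of "no" and 2 rows of "yes".
labels = ["no"] * 8 + ["yes"] * 2
deficits = rows_needed_to_balance(labels)
print(deficits)  # {'no': 0, 'yes': 6}
```

The per-category deficits are the row counts a generator would be asked to sample for each underrepresented category before training a classifier on the combined data.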
Stars: 7
Forks: 1
Language: Jupyter Notebook
License: —
Category:
Last pushed: Jun 11, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/aia39/Synthetic-Tabular-Data-Generation-using-CTGAN-and-classify-with-XGboost"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
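The curl request above could be reproduced from Python roughly as follows. This is a sketch under stated assumptions: the split of the URL path into a category segment and an `owner/repo` segment is read off the single example URL, the helper names `build_quality_url` and `fetch_quality` are hypothetical, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, repo):
    """Build the request URL; the (category, owner/repo) layout is an assumption."""
    # quote() leaves "/" unescaped by default, so "owner/repo" keeps its slash.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category, repo, timeout=10):
    """Return the raw response body as text (format not documented here)."""
    with urlopen(build_quality_url(category, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = build_quality_url(
    "ml-frameworks",
    "aia39/Synthetic-Tabular-Data-Generation-using-CTGAN-and-classify-with-XGboost",
)
print(url)
```

Calling `fetch_quality(...)` with the same two arguments performs the actual network request and counts against the daily request limit.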
Higher-rated alternatives
Diyago/Tabular-data-generation
GANs are well known for their success in realistic image generation. However, they can be applied in...
meta-llama/synthetic-data-kit
Tool for generating high-quality synthetic datasets
Data-Centric-AI-Community/ydata-synthetic
Synthetic data generators for tabular and time-series data
tdspora/syngen
Open-source version of the TDspora synthetic data generation algorithm.
vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data...