for0nething/RECON
Coresets over Multiple Tables for Feature-rich and Data-efficient Machine Learning
This tool helps data scientists and machine learning engineers build predictive models faster when their data is large and spread across multiple tables. It takes raw multi-table data and outputs a smaller, representative 'coreset' that preserves the key characteristics of the original data. The coreset can then be used to train classification or regression models at lower computational cost.
No commits in the last 6 months.
Use this if you need to train machine learning models on very large datasets composed of many joined tables, and you want to reduce training time and computational cost without sacrificing model accuracy.
Not ideal if your datasets are small, or if your machine learning workflow does not involve complex joins across multiple data sources.
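To make the coreset idea concrete, here is a minimal sketch of the simplest baseline: a uniformly sampled, weighted subset of a joined table. This is not RECON's algorithm and does not use its API; the table layout, the column names, and the helpers `join_tables` and `uniform_coreset` are all hypothetical and exist only for illustration.

```python
import random


def join_tables(customers, orders):
    """Inner-join orders to customers on customer_id (toy, in-memory join)."""
    by_id = {c["customer_id"]: c for c in customers}
    return [
        {**by_id[o["customer_id"]], **o}
        for o in orders
        if o["customer_id"] in by_id
    ]


def uniform_coreset(rows, size, seed=0):
    """Sample `size` rows without replacement.

    Every sampled row gets weight len(rows) / size, so weighted sums over
    the sample estimate the corresponding sums over all rows.
    """
    if not 0 < size <= len(rows):
        raise ValueError("size must be between 1 and len(rows)")
    rng = random.Random(seed)
    weight = len(rows) / size
    return [(row, weight) for row in rng.sample(rows, size)]


# Hypothetical toy data: 100 customers, 1000 orders.
customers = [{"customer_id": i, "age": 20 + i % 50} for i in range(100)]
orders = [
    {"order_id": j, "customer_id": j % 100, "amount": float(j % 7)}
    for j in range(1000)
]

joined = join_tables(customers, orders)
coreset = uniform_coreset(joined, size=50)

# Compare an exact aggregate with its weighted estimate from the coreset.
full_total = sum(row["amount"] for row in joined)
approx_total = sum(weight * row["amount"] for row, weight in coreset)
print(len(joined), len(coreset), full_total, round(approx_total, 1))
```

A model would then be trained on the 50 sampled rows, with the weights passed as per-sample weights. Uniform sampling is only the naive baseline; the value of a dedicated coreset method lies in choosing rows and weights more carefully than this.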
Stars: 15
Forks: 3
Language: Python
License: MIT
Category
Last pushed: Oct 05, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/for0nething/RECON"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
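The same request can be made from Python using only the standard library. This is a rough sketch under two assumptions the page does not confirm: that the endpoint returns JSON, and that other repositories follow the same `category/owner/repo` path pattern as the URL shown above. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; everything after it is assumed
# to follow the pattern <category>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL for one repository."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository and decode it as JSON.

    Assumption: the response body is JSON. Raises urllib.error.HTTPError
    on a non-2xx status, for example once the daily request limit is hit.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network request, so it is left commented out):
# data = fetch_quality("ml-frameworks", "for0nething", "RECON")
# print(data)
```

With the keyless limit of 100 requests/day, it is worth caching responses locally rather than refetching on every run.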
Higher-rated alternatives
feature-engine/feature_engine
Feature engineering and selection open-source Python library compatible with sklearn.
alteryx/featuretools
An open-source Python library for automated feature engineering.
cod3licious/autofeat
Linear Prediction Model with Automated Feature Engineering and Selection Capabilities
abess-team/abess
Fast Best-Subset Selection Library
rodrigo-arenas/Sklearn-genetic-opt
ML hyperparameter tuning and feature selection using evolutionary algorithms.