alibaba/feathub
FeatHub - A stream-batch unified feature store for real-time machine learning
FeatHub helps data scientists and ML engineers prepare data for machine learning models. It takes raw data from various sources and processes it into features, the specific data points that models use for training and prediction. This simplifies creating, deploying, and managing features across different environments.
347 stars. No commits in the last 6 months. Available on PyPI.
Use this if you are a data scientist or ML engineer who needs to efficiently create, manage, and deploy features for machine learning models, especially across both real-time and batch processing systems.
Not ideal if you are looking for a no-code solution or if your primary focus is on data visualization or basic business intelligence rather than machine learning feature engineering.
Stars: 347
Forks: 59
Language: Python
License: Apache-2.0
Category:
Last pushed: May 27, 2024
Commits (30d): 0
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/alibaba/feathub"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
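The same request can be made from Python. The sketch below uses only the standard library and mirrors the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and treating the response body as JSON is an assumption, since this page does not document the response format.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay inside it.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the listing data without an API key (hypothetical helper)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    # Assumption: the endpoint returns JSON; adjust if it does not.
    return json.loads(body)
```

Calling `fetch_quality("data-engineering", "alibaba", "feathub")` issues the same GET request as the curl command. Keyless access is limited to 100 requests/day, so repeated lookups are worth caching.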
Related tools
mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
vaexio/vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of...
mindsdb/dbt-mindsdb
dbt adapter for connecting to MindsDB
kevin-hanselman/dud
A lightweight CLI tool for versioning data alongside source code and building data pipelines.
Bread-Technologies/Bread-Dataset-Viewer
VS Code extension to easily view and handle large datasets. Look at JSONL/Parquet/CSV files...