h2oai/sparkling-water
Sparkling Water provides H2O functionality inside a Spark cluster.
Sparkling Water lets data scientists and machine learning engineers who work with large datasets train and score models without leaving Spark. It combines the data processing of Apache Spark with the machine learning algorithms of H2O-3: you supply structured data as Spark DataFrames, and H2O's algorithms train models on that data and score it.
Use this if you need to build and deploy machine learning models on very large datasets already managed within an Apache Spark ecosystem.
Not ideal if your datasets are small, or if you are not already using Apache Spark for your data processing.
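The workflow described above (Spark holds the data, H2O trains and scores) can be sketched with PySparkling, the Python API of Sparkling Water. This is a minimal sketch rather than the project's reference usage: the function name `train_and_score` and its parameters `csv_path` and `label_col` are hypothetical names introduced for illustration, and it assumes `pyspark` and a PySparkling release matching your Spark version are installed.

```python
def train_and_score(spark, csv_path, label_col):
    """Train an H2O GBM on a Spark DataFrame and return the scored rows.

    `spark` is an existing SparkSession; `csv_path` and `label_col` are
    illustrative parameters, not part of the Sparkling Water API.
    """
    # Imported here so that defining the function does not require Spark/H2O.
    from pysparkling import H2OContext
    from pysparkling.ml import H2OGBM

    # Start (or attach to) the H2O cluster inside the running Spark application.
    H2OContext.getOrCreate()

    # Load structured data as an ordinary Spark DataFrame and split it.
    df = spark.read.csv(csv_path, header=True, inferSchema=True)
    train_df, test_df = df.randomSplit([0.8, 0.2], seed=42)

    # H2O algorithms are exposed as Spark ML-style estimators.
    estimator = H2OGBM(labelCol=label_col)
    model = estimator.fit(train_df)

    # transform() scores the held-out rows and appends prediction columns.
    return model.transform(test_df)
```

Calling `train_and_score(spark, "data.csv", "label")` with a live `SparkSession` would return a Spark DataFrame, so downstream steps can stay in Spark. The file name and column name here are placeholders.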
Stars: 977
Forks: 362
Language: Scala
License: Apache-2.0
Category:
Last pushed: Nov 05, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/data-engineering/h2oai/sparkling-water"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Related tools
knime/knime-core
KNIME Analytics Platform
sparklyr/sparklyr
R interface for Apache Spark
apache/wayang
Apache Wayang is the first cross-platform data processing system.
quixio/quix-streams
Python Streaming DataFrames for Kafka
jtablesaw/tablesaw
Java dataframe and visualization library