aws/sagemaker-spark

A Spark library for Amazon SageMaker.

51 / 100 (Established)

This library lets data scientists and machine learning engineers connect large-scale Apache Spark data processing workflows to Amazon SageMaker. You can train on Spark DataFrames in SageMaker using either Amazon's built-in algorithms or your own custom models, then get predictions from the deployed SageMaker models back as Spark DataFrames. It is designed for teams that run big-data machine learning pipelines.

301 stars. No commits in the last 6 months.

Use this if you need to train machine learning models at scale using Amazon SageMaker directly from your Apache Spark applications, or if you want to deploy and get predictions from SageMaker models within your Spark pipelines.

Not ideal if you do not use Apache Spark for data processing, or if you prefer to manage your machine learning workflows entirely outside the Amazon SageMaker ecosystem.

big-data-processing machine-learning-engineering cloud-ml-training data-science-pipelines predictive-modeling
Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 25 / 25


Stars: 301
Forks: 128
Language: Scala
License: Apache-2.0
Last pushed: Mar 08, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/aws/sagemaker-spark"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
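
The same request can be made from a script. The following is a minimal Python sketch, assuming the URL pattern is the base path followed by a category and an owner/repo pair, as the curl example suggests. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and since the response format is not documented here, the body is returned as raw text (decoded as UTF-8, which is also an assumption).

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the request URL for a category and an "owner/name" repo.

    Hypothetical helper: the category/repo layout is inferred from the
    single curl example, not from API documentation.
    """
    # quote() leaves "/" unescaped by default, so "aws/sagemaker-spark"
    # stays a two-segment path.
    return f"{BASE_URL}/{quote(category)}/{quote(repo)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text.

    The response format is not stated in the source, so no parsing is
    attempted here; UTF-8 decoding is an assumption.
    """
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("ml-frameworks", "aws/sagemaker-spark")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result instead of polling.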