sagemaker-python-sdk and sagemaker-spark
These are sibling libraries in the SageMaker ecosystem that target different integration points: the Python SDK gives direct access to SageMaker for ML workflows, while the Spark library brings SageMaker into distributed Spark data processing pipelines.
About sagemaker-python-sdk
aws/sagemaker-python-sdk
A library for training and deploying machine learning models on Amazon SageMaker
This is a Python library that helps machine learning engineers and data scientists train and deploy models on Amazon SageMaker. It handles moving your data from S3 into a training environment and then deploying the trained model to serve predictions. You can use popular frameworks such as PyTorch or MXNet, or bring your own custom algorithms.
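As a rough illustration of the train-then-deploy flow described above, the sketch below uses the SDK's PyTorch estimator. It assumes version 2.x of the `sagemaker` package, valid AWS credentials, and an existing IAM execution role. The script name `train.py`, the `data/train` prefix, the instance types, and the framework/Python version pair are placeholders chosen for illustration, not values taken from the project description. `s3_uri` and `train_and_deploy` are hypothetical helper names.

```python
def s3_uri(bucket: str, key: str) -> str:
    """Build an s3:// URI from a bucket name and an object key or prefix."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def train_and_deploy(role_arn: str, bucket: str):
    """Train a PyTorch model on data in S3, then deploy it to an endpoint.

    Calling this starts billable SageMaker resources.
    """
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.pytorch import PyTorch

    estimator = PyTorch(
        entry_point="train.py",        # hypothetical training script
        role=role_arn,                 # IAM role SageMaker assumes for the job
        instance_count=1,
        instance_type="ml.m5.xlarge",  # assumed instance type
        framework_version="2.1",       # assumed PyTorch version
        py_version="py310",
    )

    # The channel name "training" is exposed to the script inside the container.
    estimator.fit({"training": s3_uri(bucket, "data/train")})

    # Returns a Predictor whose predict() method sends requests to the endpoint.
    return estimator.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
    )
```

The returned predictor's `delete_endpoint()` method can be called once the endpoint is no longer needed.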
About sagemaker-spark
aws/sagemaker-spark
A Spark library for Amazon SageMaker.
This library helps data scientists and machine learning engineers integrate their large-scale data processing workflows with Amazon SageMaker's machine learning capabilities. You can feed Spark DataFrames into SageMaker for training, using either Amazon's built-in algorithms or your own custom models, and then get predictions back as Spark DataFrames from the deployed SageMaker models. It is designed for teams managing big-data machine learning pipelines.
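The DataFrame-in, DataFrame-out flow described above might look like the following with the library's PySpark bindings (`sagemaker_pyspark`) and the built-in K-Means algorithm. This is a sketch under assumptions: it presumes a Spark session already configured with the SageMaker Spark JARs on its classpath, a DataFrame with a `features` vector column, and an IAM role ARN. The instance types are placeholders, and `instance_settings` and `kmeans_on_dataframe` are hypothetical helper names.

```python
def instance_settings(training_type: str = "ml.m4.xlarge",
                      endpoint_type: str = "ml.m4.xlarge") -> dict:
    """Collect the instance-related keyword arguments for the estimator."""
    return {
        "trainingInstanceType": training_type,
        "trainingInstanceCount": 1,
        "endpointInstanceType": endpoint_type,
        "endpointInitialInstanceCount": 1,
    }


def kmeans_on_dataframe(spark_df, role_arn: str, k: int, feature_dim: int):
    """Train K-Means on a Spark DataFrame and return it with predictions added.

    Calling this starts billable SageMaker resources.
    """
    # Imported lazily so the helper above can be used without Spark installed.
    from sagemaker_pyspark import IAMRole
    from sagemaker_pyspark.algorithms import KMeansSageMakerEstimator

    estimator = KMeansSageMakerEstimator(
        sagemakerRole=IAMRole(role_arn),
        **instance_settings(),
    )
    estimator.setK(k)                    # number of clusters
    estimator.setFeatureDim(feature_dim)  # length of each feature vector

    # fit() runs a SageMaker training job and deploys the model to an endpoint.
    model = estimator.fit(spark_df)

    # transform() sends rows to the endpoint and appends prediction columns.
    return model.transform(spark_df)
```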