Amazon SageMaker Examples

Example Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using Amazon SageMaker.

:books: Read this before you proceed further

Amazon SageMaker examples are divided into two repositories:

Planning to submit a PR to this repository? Read this first:

:hammer_and_wrench: Setup

The quickest setup to run example notebooks includes:

:computer: Usage

These example notebooks are automatically loaded into SageMaker Notebook Instances. They can be accessed by clicking on the SageMaker Examples tab in Jupyter or the SageMaker logo in JupyterLab.

Although most examples use key Amazon SageMaker functionality such as distributed, managed training or real-time hosted endpoints, these notebooks can be run outside of Amazon SageMaker Notebook Instances with minimal modification (updating the IAM role definition and installing the necessary libraries).
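As a rough illustration of the "updating IAM role definition" step, the sketch below resolves a role ARN when a notebook runs outside a SageMaker Notebook Instance. The environment variable name `SAGEMAKER_ROLE_ARN` and the helper `resolve_execution_role` are hypothetical and not part of this repository; `sagemaker.get_execution_role()` is the SageMaker Python SDK call that many notebooks use, and it generally only succeeds where a SageMaker execution role is attached.

```python
import os
import re

# Hypothetical environment variable name; choose whatever fits your setup.
ROLE_ENV_VAR = "SAGEMAKER_ROLE_ARN"

# Loose shape check for an IAM role ARN (any AWS partition, 12-digit account ID).
_ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def resolve_execution_role(env_var: str = ROLE_ENV_VAR) -> str:
    """Return the IAM role ARN that SageMaker jobs should assume.

    An explicitly configured ARN takes precedence; otherwise fall back to
    the SageMaker Python SDK, which works inside SageMaker environments.
    """
    arn = os.environ.get(env_var)
    if arn:
        if not _ROLE_ARN.match(arn):
            raise ValueError(f"{env_var} does not look like an IAM role ARN: {arn!r}")
        return arn
    try:
        # Imported lazily so this helper loads even without the SDK installed.
        import sagemaker

        return sagemaker.get_execution_role()
    except Exception as exc:
        raise RuntimeError(
            f"No role found: set {env_var} or run inside a SageMaker environment."
        ) from exc
```

In a notebook, `role = sagemaker.get_execution_role()` could then be replaced with `role = resolve_execution_role()`, assuming the variable is exported before Jupyter starts.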

As of February 7, 2022, the default branch is named "main". See our announcement for details and how to update your existing clone.

:notebook: Examples

Introduction to geospatial capabilities

These examples introduce SageMaker geospatial capabilities, which make it easy to build, train, and deploy ML models using geospatial data.

Introduction to Ground Truth Labeling Jobs

These examples provide quick walkthroughs to get you up and running with the labeling job workflow for Amazon SageMaker Ground Truth.

Introduction to Applying Machine Learning

These examples provide a gentle introduction to machine learning concepts as they are applied in practical use cases across a variety of sectors.

SageMaker Automatic Model Tuning

These examples introduce SageMaker's hyperparameter tuning functionality, which helps deliver the best possible predictions by running a large number of training jobs to determine which hyperparameter values perform best.
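To make the idea of running many training jobs and keeping the best configuration concrete, here is a minimal pure-Python sketch. It is not the SageMaker tuning API: plain random search is swapped in for brevity, and `random_search`, `toy_validation_loss`, and the search ranges are all illustrative assumptions, with the toy function standing in for a real training job.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample hyperparameters uniformly and keep the lowest-scoring trial.

    objective: callable taking a dict of hyperparameters, returning a loss.
    space: dict mapping each hyperparameter name to a (low, high) range.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        # Each trial plays the role of one training job.
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_loss(params):
    """Stand-in for a training job; minimized at learning_rate=0.1, l2=0.01."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["l2"] - 0.01) ** 2


space = {"learning_rate": (0.001, 0.5), "l2": (0.0, 0.1)}
best_params, best_score = random_search(toy_validation_loss, space, n_trials=50)
print(best_params, best_score)
```

With a fixed seed, adding trials can only keep or improve the best score, which is the basic trade-off between tuning budget and result quality.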

SageMaker Autopilot

These examples introduce SageMaker Autopilot. Autopilot automatically performs feature engineering, model selection, and model tuning (hyperparameter optimization), and allows you to directly deploy the best model to an endpoint to serve inference requests.

Introduction to Amazon Algorithms

These examples provide quick walkthroughs to get you up and running with Amazon SageMaker's custom-developed algorithms. Most of these algorithms can train on distributed hardware, scale incredibly well, and are faster and cheaper than popular alternatives.

Amazon SageMaker RL

These examples demonstrate different capabilities of Amazon SageMaker RL.

Scientific Details of Algorithms

These examples provide a more thorough mathematical treatment of a select group of algorithms.

Amazon SageMaker Debugger

These examples provide an introduction to SageMaker Debugger, which provides debugging and monitoring capabilities for training machine learning and deep learning models. Note that although these notebooks focus on a specific framework, the same approach works with all the frameworks that Amazon SageMaker Debugger supports. The notebooks below are listed in the order in which we recommend you review them.

Amazon SageMaker Distributed Training

These examples provide an introduction to SageMaker Distributed Training Libraries for data parallelism and model parallelism. The libraries are optimized for the SageMaker training environment, help adapt your distributed training jobs to SageMaker, and improve training speed and throughput. More examples for models such as BERT and YOLOv5 can be found in distributed_training/.

Amazon SageMaker Smart Sifting

These examples provide an introduction to the Smart Sifting library. Smart Sifting is a framework to speed up training of PyTorch models. The framework implements a set of algorithms that filter out inconsequential training examples during training, reducing the computational cost and accelerating the training process. It is configuration-driven and extensible, allowing users to add custom logic to transform their training examples into a filterable format. Smart Sifting provides a generic utility for any DNN model and can reduce training infrastructure cost by up to 35%.
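The filtering idea can be sketched in a few lines of plain Python. This is not the Smart Sifting library's API: the helper `sift_batch`, the `keep_fraction` parameter, and the use of per-example loss as the measure of how consequential an example is are all assumptions made for illustration.

```python
import math


def sift_batch(examples, losses, keep_fraction=0.5):
    """Keep the highest-loss fraction of a batch, preserving original order.

    Assumption: a high per-example loss marks an example the model still
    has something to learn from; low-loss examples are dropped.
    """
    if len(examples) != len(losses):
        raise ValueError("examples and losses must have the same length")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not examples:
        return []
    n_keep = max(1, math.ceil(len(examples) * keep_fraction))
    # Rank indices by loss, highest first, then take the top n_keep.
    ranked = sorted(range(len(examples)), key=lambda i: losses[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore the original batch order
    return [examples[i] for i in kept]


batch = ["a", "b", "c", "d"]
per_example_loss = [0.1, 2.0, 0.05, 1.5]
print(sift_batch(batch, per_example_loss, keep_fraction=0.5))
```

In a real training loop the backward pass would then run only on the kept examples, which is where the compute saving would come from under this assumption.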

Amazon SageMaker Clarify

These examples provide an introduction to SageMaker Clarify, which gives machine learning developers greater visibility into their training data and models so they can identify and limit bias and explain predictions.

Publishing content from RStudio on Amazon SageMaker to RStudio Connect

These examples show you how to run R examples and publish applications from RStudio on Amazon SageMaker to RStudio Connect.

Advanced Amazon SageMaker Functionality

These examples showcase unique functionality available in Amazon SageMaker. They cover a broad range of topics and utilize a variety of methods, but aim to provide the user with sufficient insight or inspiration to develop within Amazon SageMaker.

Amazon SageMaker Neo Compilation Jobs

These examples provide an introduction to how to use Neo to compile and optimize deep learning models.

Amazon SageMaker Processing

These examples show you how to use SageMaker Processing jobs to run data processing workloads.

Amazon SageMaker Pipelines

These examples show you how to use SageMaker Pipelines to create, automate and manage end-to-end Machine Learning workflows.

Amazon SageMaker Pre-Built Framework Containers and the Python SDK

Pre-Built Deep Learning Framework Containers

These examples show you how to train and host in pre-built deep learning framework containers using the SageMaker Python SDK.

Pre-Built Machine Learning Framework Containers

These examples show you how to build machine learning models with frameworks like Apache Spark or Scikit-learn using the SageMaker Python SDK.

Using Amazon SageMaker with Apache Spark

These examples show how to use Amazon SageMaker for model training, hosting, and inference through Apache Spark using SageMaker Spark. SageMaker Spark allows you to interleave Spark Pipeline stages with Pipeline stages that interact with Amazon SageMaker.

Using Amazon SageMaker with Amazon Keyspaces (for Apache Cassandra)

These examples show how to use Amazon SageMaker to read data from Amazon Keyspaces.

AWS Marketplace

Create algorithms/model packages for listing in AWS Marketplace for machine learning

These example notebooks show you how to package a model or algorithm for listing in AWS Marketplace for machine learning.

Once you have created an algorithm or a model package, the next step is to list it in AWS Marketplace and provide a sample notebook that customers can use to try it.

Use algorithms, data, and model packages from AWS Marketplace

These examples show you how to use model packages and algorithms from AWS Marketplace, and dataset products from AWS Data Exchange, for machine learning.

Using Amazon SageMaker for Generative AI use cases

These examples show you how to use AWS services for Generative AI use cases.

:balance_scale: License

This library is licensed under the Apache 2.0 License. For more details, please take a look at the LICENSE file.

:handshake: Contributing

Although we're extremely excited to receive contributions from the community, we're still working on the best mechanism to take in examples from external sources. Please bear with us in the short term if pull requests take longer than expected or are closed. Please read our contributing guidelines if you'd like to open an issue or submit a pull request.