SageMaker Notebook Instance Lifecycle Config Samples

Overview

A collection of sample scripts to customize Amazon SageMaker Notebook Instances using Lifecycle Configurations.

Lifecycle Configurations provide a mechanism to customize Notebook Instances via shell scripts that are executed during the lifecycle of a Notebook Instance.

Sample Scripts

  • auto-stop-idle - This script stops a SageMaker notebook once it has been idle for more than 1 hour (the default idle time).
  • connect-emr-cluster - This script connects an EMR cluster to the Notebook Instance using SparkMagic.
  • install-conda-package-all-environments - This script installs a single conda package in all SageMaker conda environments, apart from the JupyterSystemEnv which is a system environment reserved for Jupyter.
  • install-conda-package-single-environment - This script installs a single conda package in a single SageMaker conda environment.
  • install-lab-extension - This script installs a JupyterLab extension package in a SageMaker Notebook Instance.
  • install-nb-extension - This script installs a single Jupyter notebook extension package in a SageMaker Notebook Instance.
  • install-pip-package-all-environments - This script installs a single pip package in all SageMaker conda environments, apart from the JupyterSystemEnv which is a system environment reserved for Jupyter.
  • install-pip-package-single-environment - This script installs a single pip package in a single SageMaker conda environment.
  • install-server-extension - This script installs a single Jupyter notebook server extension package in a SageMaker Notebook Instance.
  • mount-efs-file-system - This script mounts an EFS file system to the Notebook Instance at the ~/SageMaker/efs directory based on its DNS name.
  • persistent-conda-ebs - This script installs a custom, persistent installation of conda on the Notebook Instance's EBS volume, and ensures that these custom environments are available as kernels in Jupyter.
  • proxy-for-jupyter - This script configures proxy settings for your Jupyter notebooks and the SageMaker Notebook Instance.
  • publish-instance-metrics - This script publishes the system-level metrics from the Notebook Instance to CloudWatch.
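
As a rough illustration of how the "all environments" scripts listed above might pick their targets, the sketch below prints every conda environment name except JupyterSystemEnv. The function name `list_target_envs` is hypothetical, and the default directory `/home/ec2-user/anaconda3/envs` is an assumption about where the environments live; neither is taken from the sample scripts themselves.

```shell
# Hypothetical helper: print the names of conda environments that a package
# should be installed into, skipping the environment reserved for Jupyter.
# ASSUMPTION: environments live under /home/ec2-user/anaconda3/envs; pass a
# different directory as the first argument if that is not the case.
list_target_envs() {
    envs_dir=${1:-/home/ec2-user/anaconda3/envs}
    for env_path in "$envs_dir"/*; do
        # Skip anything that is not a directory (including an unmatched glob).
        [ -d "$env_path" ] || continue
        env_name=$(basename "$env_path")
        # JupyterSystemEnv is reserved for Jupyter, so leave it untouched.
        [ "$env_name" = "JupyterSystemEnv" ] && continue
        printf '%s\n' "$env_name"
    done
}
```

A script built on this would then loop over the printed names and install the package into each environment in turn.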

Development

Contributors can develop scripts directly on SageMaker Notebook Instances, since that is the environment the scripts run in. Lifecycle Configuration scripts run as root, and the working directory is /. To simulate the execution environment, run:

sudo su
export PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
cd /

Edit the script in a file such as my-script-on-start.sh and execute it with:

sh my-script-on-start.sh
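
For repeated runs, the PATH and working directory described above could be wrapped in a small helper. This is only a sketch: `run_like_lifecycle_config` is a hypothetical name, and the function does not switch users, so it only matches the real environment when called from a root shell.

```shell
# Hypothetical helper: run a script with the PATH and working directory
# described above. It does NOT become root; call it from a root shell
# (for example after `sudo su`) to mirror the real execution environment.
run_like_lifecycle_config() {
    # Resolve the script to an absolute path before changing directory.
    script_path=$(cd "$(dirname "$1")" && pwd)/$(basename "$1")
    (
        # The subshell keeps PATH and the directory change from leaking
        # into the caller's shell.
        export PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
        cd / || exit 1
        sh "$script_path"
    )
}
```

For example, `run_like_lifecycle_config my-script-on-start.sh` runs the script from / and returns its exit status.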

The directory structure followed is:

scripts/
    my-script-name/
        on-start.sh
        on-create.sh
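
A new script directory in this layout could be scaffolded roughly as follows. The helper name `scaffold_lifecycle_script` and the stub contents (a bash shebang plus `set -e`) are assumptions for illustration, not a convention stated in this README.

```shell
# Hypothetical helper: create scripts/<name>/ with stub on-start.sh and
# on-create.sh files. The optional second argument overrides the root
# directory (default: scripts).
scaffold_lifecycle_script() {
    name=$1
    root=${2:-scripts}
    mkdir -p "$root/$name" || return 1
    for hook in on-start.sh on-create.sh; do
        target="$root/$name/$hook"
        # Never overwrite a script that already exists.
        [ -e "$target" ] && continue
        # ASSUMPTION: stubs start with a bash shebang and stop on first error.
        printf '#!/bin/bash\nset -e\n\n# TODO: add commands for %s\n' "$hook" > "$target"
    done
}
```

Running `scaffold_lifecycle_script my-script-name` from the repository root would produce the two files shown in the tree above.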

Testing

To test the script end-to-end:

  • Create a Lifecycle Configuration with the script content and a Notebook Instance with the Lifecycle Configuration.

# If the scripts are in a directory "scripts/my-script-name/*"
SCRIPT_NAME=my-script-name
ROLE_ARN=my-role-arn

RESOURCE_NAME="$SCRIPT_NAME-$RANDOM"

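# Add any script-specific options such as subnet-id.
# Content must be a single-line base64 string; a missing script falls back to a blank line.
# The space in "$( (" avoids ambiguity with arithmetic expansion.
aws sagemaker create-notebook-instance-lifecycle-config \
    --notebook-instance-lifecycle-config-name "$RESOURCE_NAME" \
    --on-start Content="$( (cat "scripts/$SCRIPT_NAME/on-start.sh" || echo "") | base64 | tr -d '\n')" \
    --on-create Content="$( (cat "scripts/$SCRIPT_NAME/on-create.sh" || echo "") | base64 | tr -d '\n')"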

aws sagemaker create-notebook-instance \
    --notebook-instance-name "$RESOURCE_NAME" \
    --instance-type ml.t2.medium \
    --role-arn "$ROLE_ARN" \
    --lifecycle-config-name "$RESOURCE_NAME"

aws sagemaker wait \
    notebook-instance-in-service \
    --notebook-instance-name "$RESOURCE_NAME"
  • Access the Notebook Instance and perform any validation specific to the script.
aws sagemaker create-presigned-notebook-instance-url \
    --notebook-instance-name "$RESOURCE_NAME"
  • Stop and start the Notebook Instance.
aws sagemaker stop-notebook-instance \
    --notebook-instance-name "$RESOURCE_NAME"

aws sagemaker wait \
    notebook-instance-stopped \
    --notebook-instance-name "$RESOURCE_NAME"

aws sagemaker start-notebook-instance \
    --notebook-instance-name "$RESOURCE_NAME"

aws sagemaker wait \
    notebook-instance-in-service \
    --notebook-instance-name "$RESOURCE_NAME"
  • Access the Notebook Instance again and perform any validation specific to the script.
aws sagemaker create-presigned-notebook-instance-url \
    --notebook-instance-name "$RESOURCE_NAME"
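
The steps above leave a running Notebook Instance and a Lifecycle Configuration behind. A cleanup along these lines may be useful once validation is done; `cleanup_test_resources` is a hypothetical helper name, and the sketch assumes both resources share the name stored in `RESOURCE_NAME`, as in the commands above.

```shell
# Hypothetical helper: delete the test Notebook Instance and Lifecycle
# Configuration. Expects the shared resource name as its only argument.
cleanup_test_resources() {
    resource_name=$1

    # A Notebook Instance has to be stopped before it can be deleted.
    aws sagemaker stop-notebook-instance \
        --notebook-instance-name "$resource_name" || return 1
    aws sagemaker wait notebook-instance-stopped \
        --notebook-instance-name "$resource_name" || return 1

    aws sagemaker delete-notebook-instance \
        --notebook-instance-name "$resource_name" || return 1
    aws sagemaker wait notebook-instance-deleted \
        --notebook-instance-name "$resource_name" || return 1

    # Remove the Lifecycle Configuration once no instance references it.
    aws sagemaker delete-notebook-instance-lifecycle-config \
        --notebook-instance-lifecycle-config-name "$resource_name"
}
```

Call it as `cleanup_test_resources "$RESOURCE_NAME"` after the final validation step.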

File a Pull Request following the instructions in the Contribution Guidelines.

License Summary

This sample code is made available under the MIT-0 license. See the LICENSE file.