In this lab, we introduce you to Amazon SageMaker by using the Amazon-provided Linear Learner algorithm to perform binary classification of images of handwritten digits from the MNIST database. Specifically, we'll train a model to identify whether or not a digit is a "0". In doing so, we will demonstrate how to use a Jupyter notebook and the SageMaker Python SDK to create a script that pre-processes data, trains a model, creates a SageMaker hosted endpoint, and makes predictions against this endpoint - completing a full machine learning workflow end-to-end.
- In your notebook instance, click on the top-level folder.
- Navigate to `sample-notebooks / introduction_to_amazon_algorithms / linear_learner_mnist`.
- Open the `linear_learner_mnist.ipynb` notebook, then follow the directions in the notebook.
- In the `bucket = '<your_s3_bucket_name_here>'` code line, paste the name of the S3 bucket you created in Module 1 to replace `<your_s3_bucket_name_here>`. The code line should now read similar to `bucket = 'smworkshop-john-smith'`. Do NOT paste the entire path (s3://.......), just the bucket name.
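The bucket-name rule above can be checked with a small guard before running the rest of the notebook. This is only a sketch: `check_bucket_name` is a hypothetical helper and is not part of the notebook or the SageMaker SDK. It assumes a bare bucket name never contains `/` and never keeps the `<...>` placeholder.

```python
def check_bucket_name(value: str) -> str:
    """Return a bare S3 bucket name, or raise if it looks like a path or placeholder.

    Hypothetical helper for illustration; not part of the lab notebook.
    """
    name = value.strip()
    if name.startswith("<") and name.endswith(">"):
        raise ValueError("Replace the placeholder with your own bucket name.")
    if name.startswith("s3://") or "/" in name:
        raise ValueError("Use only the bucket name, not the full s3:// path.")
    if not name:
        raise ValueError("The bucket name is empty.")
    return name


# Example value taken from the instructions above.
bucket = check_bucket_name("smworkshop-john-smith")
```

Passing something like `'s3://smworkshop-john-smith/data'` raises a `ValueError` instead of failing later during the upload step.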
NOTE: Training the model for this example typically takes about 5 minutes.
- How good is the model? Compute precision, recall, and F1 metrics to find out.
- Re-train the model to identify another digit.
- Try changing the classification algorithm (e.g. to a factorization machine) and repeating the workflow.
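As a starting point for the first two exercises, here is a minimal sketch in plain Python. It assumes you already have the true digit for each test image and a 0/1 prediction for each one from the endpoint. The function names `binary_labels` and `precision_recall_f1` and the toy data are illustrative assumptions, not taken from the notebook.

```python
def binary_labels(digits, target_digit):
    """Map raw digit labels (0-9) to 1.0 for the target digit and 0.0 otherwise."""
    return [1.0 if d == target_digit else 0.0 for d in digits]


def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Fall back to 0.0 when a denominator is zero (no positives predicted or present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data for illustration only: true digits and made-up model predictions.
digits = [0, 1, 0, 7, 0, 3]
y_true = binary_labels(digits, target_digit=0)
y_pred = [1, 0, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

To re-train for another digit, the same relabeling idea applies: build the training labels with a different `target_digit` (for example, `binary_labels(digits, target_digit=7)`) before converting and uploading the data.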