Amazon SageMaker notebook, image files, and Lambda function for performing inference on DeepLens.
The running model will identify the primary subject in view and determine whether they are compliant with worksite safety standards for wearing a hardhat.
This model has only been trained with a clean, white hardhat. It will not currently produce reliable results for different coloured hardhats or for hardhats covered with stickers.
Currently I see this as a useful talking point about the need for training data that adequately represents the real-world use case.
The model has been trained at subject eye-level. A real-world deployment should consider the position of the camera (e.g. elevation) and ensure training images are captured at the correct angle.
This project aims to demonstrate the art of the possible for worksite safety.
Given what has been produced as a learning activity for one person with a couple of thousand images, imagine what is possible for a dedicated team!
The algorithm chosen in this iteration of the project is the Amazon SageMaker built-in Image Classification algorithm. This algorithm looks at an entire image and assigns it to one of the following classes:
- "Compliant" - The main subject in the image is wearing their hardhat and is hence compliant with that aspect of worksite safety.
- "Not compliant" - The main subject in the image is not wearing their hardhat and is hence not compliant with that aspect of worksite safety.
- "No Subject" - No subject was found in the image.
- "Unsure" - A subject is in the image but the algorithm is unable to determine compliance. The head may be cropped out of the image or obstructed.
A suggested future piece of work is to look at the new Amazon SageMaker built-in Object Detection algorithm. This would likely produce even better results for subjects standing a variable distance from the camera as well as for images with multiple subjects in view. The data preparation would however require a lot more effort.
The training data used with this model was a set of images of an individual in near full-frame from their head down to around their thighs/knees. Some sample images have been provided in this repo; however, for privacy reasons no photos showing faces have been provided. Photos were taken on an iPhone with the camera held at eye-level. Subjects were asked to turn to each 45-degree point on the compass as well as look left and right at many of these compass points.
Photos were taken with the subject wearing the hardhat, along with a complementary set of photos without the hardhat, with a baseball cap, or with the hardhat held in front of them or under their arm.
A real-world scenario should pay careful attention to the location of the camera (will it be mounted overhead?) in order to gather images taken at the appropriate angle.
Photos were first resized and then organised into the directory structure specified below.
To resize the photos, the resize.py program located in this GitHub repo was used. The photos were resized to 1920 x 1080 to match the size of the images pulled from the DeepLens, so that the images used for training are the same size as those seen by the trained model. The following inputs were used:
- -i <The directory where the photos are>
- -x <The desired width (we used 1920)>
- -y <The desired height (we used 1080)>
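The contents of resize.py are not reproduced here. As a rough sketch only, a script accepting the same three flags might look like the following. It assumes OpenCV (cv2) is installed and that the photos are .jpg files; the `resized` output sub-directory and both function names are hypothetical and may differ from what resize.py actually does.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the three flags described above (-i, -x, -y)."""
    parser = argparse.ArgumentParser(description="Resize every photo in a directory")
    parser.add_argument("-i", dest="input_dir", required=True,
                        help="directory where the photos are")
    parser.add_argument("-x", dest="width", type=int, required=True,
                        help="desired width in pixels")
    parser.add_argument("-y", dest="height", type=int, required=True,
                        help="desired height in pixels")
    return parser.parse_args(argv)


def resize_directory(input_dir, width, height):
    """Write a resized copy of each .jpg into <input_dir>/resized (assumed layout)."""
    import cv2  # imported lazily so argument parsing works without OpenCV

    out_dir = Path(input_dir) / "resized"
    out_dir.mkdir(exist_ok=True)
    written = 0
    for path in sorted(Path(input_dir).glob("*.jpg")):
        image = cv2.imread(str(path))
        if image is None:  # unreadable file, skip it
            continue
        resized = cv2.resize(image, (width, height))  # dsize is (width, height)
        cv2.imwrite(str(out_dir / path.name), resized)
        written += 1
    return written
```

With these assumptions, `resize_directory("photos", 1920, 1080)` would produce DeepLens-sized copies of everything in a `photos` directory.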
The following organisational structure was then used:
images
⎿ 0_compliant
    ⎿ <all compliant images>
⎿ 1_notcompliant
    ⎿ <all non-compliant images>
⎿ 2_unsure
    ⎿ <all images where head is cropped/obscured>
⎿ 3_nosubject
    ⎿ <all background/fuzz images>
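Before going further it can help to confirm that every class directory is populated and roughly balanced. A minimal sketch, assuming the layout above and JPEG files; `count_images_per_class` is a hypothetical helper and not part of this repo.

```python
from pathlib import Path


def count_images_per_class(root):
    """Return {class_directory_name: number_of_jpeg_files} for each sub-directory."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.suffix.lower() in {".jpg", ".jpeg"}
            )
    return counts
```

Calling `count_images_per_class("images")` returns one entry per class directory, which makes an empty or badly under-represented class easy to spot.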
Next I used OpenCV to double my dataset by creating a mirror image (a flip about the vertical axis, i.e. left to right) of every image.
The python3 code used to perform this augmentation is provided as augment_images.py.
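augment_images.py itself is not reproduced here. As a rough sketch of the idea, assuming OpenCV (cv2) is installed and that "mirror" means a left-to-right flip, the augmentation could look like this. The `_mirror` filename suffix and both function names are hypothetical.

```python
from pathlib import Path


def mirrored_path(path):
    """Return the output path for the mirrored copy, e.g. a/b.jpg -> a/b_mirror.jpg."""
    path = Path(path)
    return path.with_name(f"{path.stem}_mirror{path.suffix}")


def mirror_images(root):
    """Save a left-to-right mirrored copy of every .jpg found under root."""
    import cv2  # imported lazily; requires the opencv-python package

    written = 0
    # sorted() materialises the file list first, so new copies are not revisited
    for path in sorted(Path(root).rglob("*.jpg")):
        if path.stem.endswith("_mirror"):  # skip copies from a previous run
            continue
        image = cv2.imread(str(path))
        if image is None:  # unreadable file, skip it
            continue
        flipped = cv2.flip(image, 1)  # flipCode=1 flips around the vertical axis
        cv2.imwrite(str(mirrored_path(path)), flipped)
        written += 1
    return written
```

Because each copy is written next to its source, the mirrored images stay in the same class directory as the originals.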
SageMaker's Image Classification algorithm has a preferred record type of RecordIO.
Download im2rec.py from this GitHub repo or install MXNet (im2rec.py is included in the MXNet installation).
Run the following command to create .lst (list) and .idx (index) files for the dataset:
python im2rec.py --list --recursive --train-ratio 0.95 dataset images
Note: If you want to check how many images are in your dataset, use one of the following commands:
Linux/macOS: wc -l dataset_train.lst
Windows: findstr "." dataset_train.lst | find /c /v ""
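A platform-independent alternative, shown here as a small sketch: im2rec.py writes one line per image to each .lst file, so counting the non-empty lines gives the image count. `count_lst_entries` is a hypothetical helper, not part of this repo.

```python
def count_lst_entries(lst_path):
    """Count non-empty lines in an im2rec .lst file (one line per image)."""
    with open(lst_path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())
```

The count for dataset_train.lst is the figure needed for num_training_samples in the notebook.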
Once complete, run the following command to create the RecordIO files that will be used as input for training the model:
python im2rec.py --resize 224 dataset images
Create an S3 bucket in the us-east-1 (North Virginia) region. DeepLens deployments are performed from this region, so keeping the bucket there avoids having to transfer models between regions.
Copy the training and validation RecordIO files into separate directories in your S3 bucket:
aws s3 cp dataset_train.rec s3://<S3 bucket name>/train/
aws s3 cp dataset_val.rec s3://<S3 bucket name>/validation/
It can be useful to have some test images in JPEG format that are already resized to the dimensions expected by your trained model. This allows some preliminary testing of the trained model within your notebook before deploying it to a DeepLens device.
Store a handful of test images in a directory on your local machine (in this example the local directory is called images_test and the S3 directory is called test).
Copy the images to a directory in your S3 bucket:
aws s3 cp images_test/ s3://<S3 bucket name>/test/ --recursive
Using Amazon SageMaker in the us-east-1 (North Virginia) region, create a notebook instance
Upload the notebook provided in this distribution "deeplens-worksite-safety-public.ipynb"
Find the line 'bucket_path="deeplens-your-bucketpath-name"' and change to match your bucket name and optionally a path in the bucket
Note: Your bucket name must start with the word 'deeplens' in order to allow the DeepLens IAM role to access your model.
Find the line 'num_layers = 50'. This is a hyperparameter you should experiment with. A deeper network (more layers) only pays off if you have many training images.
Find the line 'image_shape = "3,224,224"'. If you have changed the dimension of the images from 3 channels (RGB) with a maximum of 224x224, you will need to adjust this line
Find the line 'num_training_samples = 5308'. This must match the number of images in your training set
Find the line 'num_classes = 4'. This must match the number of output classifications
Find the line 'job_name_prefix = "your-jobname-prefix"'. Change this name to something relevant to your project
Note: mini_batch_size determines the number of images sent to each GPU at a time. Ensure this setting is smaller than the number of images in your training set.
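These settings depend on one another, so a quick consistency check before launching a training job can save a failed run. The following sketch only mirrors the values named above; `mini_batch_size = 32` is an illustrative assumption, and `check_hyperparameters` is a hypothetical helper that is not part of the notebook.

```python
def check_hyperparameters(params):
    """Raise ValueError if the training settings are mutually inconsistent."""
    channels, height, width = (int(v) for v in params["image_shape"].split(","))
    if min(channels, height, width) <= 0:
        raise ValueError("image_shape values must be positive")
    if params["num_classes"] < 2:
        raise ValueError("need at least two output classes")
    if params["mini_batch_size"] >= params["num_training_samples"]:
        raise ValueError("mini_batch_size must be smaller than the training set")
    return channels, height, width


hyperparameters = {
    "num_layers": 50,
    "image_shape": "3,224,224",
    "num_training_samples": 5308,
    "num_classes": 4,
    "mini_batch_size": 32,  # assumed value, for illustration only
}
check_hyperparameters(hyperparameters)
```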
Execute notebook cells down to but not including the section "Inference"
You now have a trained model!
You could now jump straight to deploying to your DeepLens device; however, the next step runs through some local tests using SageMaker to ensure the model is performing correctly
Find the line 'model_name="your-modelname"' and give it a unique model name
Execute notebook cells down to but not including the section "Download test images"
You now have an inference endpoint hosted by the SageMaker service!
Continue to the next cell in the notebook to download test images from the S3 bucket to this notebook local directory
In the section "Download test images" update the bucketpath to the location of your test images
In the section "Validate Model", find the line 'file_name = "/tmp/test/sample_image1.jpg"' and change the image name to match the image you wish to perform inference on
Execute the cell. Was the result correct? Was the probability nice and high?
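To make the result easier to judge, the raw response can be mapped back to class names. This sketch assumes the endpoint returns a JSON array with one probability per class, and that class indices follow the numeric prefixes of the training directories (0_compliant, 1_notcompliant, 2_unsure, 3_nosubject). `CLASS_NAMES` and `top_prediction` are hypothetical names, and the response string is made up for illustration.

```python
import json

# Assumed index order, taken from the training directory prefixes
CLASS_NAMES = ["compliant", "not compliant", "unsure", "no subject"]


def top_prediction(response_body):
    """Return (label, probability) for the most likely class in a JSON response."""
    probabilities = json.loads(response_body)
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("unexpected number of class probabilities")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CLASS_NAMES[best], probabilities[best]


# Made-up response, not real model output
label, probability = top_prediction("[0.91, 0.05, 0.03, 0.01]")
print(f"{label}: {probability:.2f}")
```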
Execute the final cell to delete (and stop paying for) the SageMaker hosting endpoint
Register your DeepLens device by following the instructions at: https://docs.aws.amazon.com/deeplens/latest/dg/deeplens-getting-started-register.html
There are two options provided here:
- Import your own model built using SageMaker
- Import the sample pre-built model provided in this repository
From the DeepLens console, select 'Models'
Select 'Import Model'
From 'Import Source', ensure 'Amazon SageMaker trained model' is selected
From 'Model settings', select the following options:
- Amazon SageMaker training job ID: Choose the TRAINING JOB ID that produced your model (this will likely be the most recent training job)
- Model Name: Select a name that is meaningful to you
- Model Framework: MXNet
From the DeepLens console, select 'Models'
Select 'Import Model'
From 'Import Source', select 'Externally Trained Model'
From 'Model settings', select the following options:
- Model artifact path: Select the S3 bucket path to the location of the model artifacts
Note 1: The path must start with s3://deeplens
Note 2: In this case the artifact will be model.tar.gz
- Model Name: Select a name that is meaningful to you
- Model Framework: MXNet
Follow the instructions provided at: https://docs.aws.amazon.com/deeplens/latest/dg/deeplens-inference-lambda-create.html and create a function called "deeplens-hardhat-detection"
Note: ensure you copy the greengrass-hello-world blueprint as this includes libraries that are required
- Replace all code in the file "greengrassHelloWorld.py" with the code provided in the file "greengrassHHdetect.py"
- In your Lambda environment, rename the Python file from "greengrassHelloWorld.py" to "greengrassHHdetect.py"
- Ensure that the Lambda function handler is specified as "greengrassHHdetect.function_handler"
Save your function
Publish your function "Actions - Publish new version"
From the DeepLens console, select 'Projects'
Select 'Create new project'
From 'Project type', select 'Create a new blank project'
Select 'Next'
Within 'Project Information', select a Project Name that is meaningful to you
From 'Project Content', select the following options:
- Add model: Select the model you imported
- Add function: Select the Lambda function (at the required version) you published
Select 'Create' to create the project
From the DeepLens console, select 'Projects'
Select the radio button next to your project
Select 'Deploy to device'
Select the radio button next to your desired DeepLens device
Select 'Review'
When ready to deploy, select 'Deploy'
I found it very helpful to purchase a micro-HDMI to HDMI cable so that I could directly display the DeepLens output on a screen. Note that a USB keyboard and mouse will also be necessary if you wish to do this.
To view the DeepLens output in a browser instead, see: https://docs.aws.amazon.com/deeplens/latest/dg/deeplens-viewing-device-output-in-browser.html
I have published an alternate version of the Lambda function and model in the subfolder 'modelv2'.
I am keen to hear your feedback on whether this new release produces better results.
The differences from v1 are:
- To avoid saturating the IoT endpoint, only every 10th inference is sent to the endpoint
- Additional images were added to cover edge cases
- Lambda function is set to infer 'unsure' if neither 'compliant' nor 'not compliant' reaches 50% confidence
- Color_Crop_Transform setting was enabled during training in order to augment the image set
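The throttling and fallback rules can be sketched as follows. This is not the code in modelv2. It assumes the intended rule is to report 'unsure' when neither 'compliant' nor 'not compliant' reaches 50% confidence, and it leaves out how the other classes are handled because that is not stated here. All names are hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.5  # 50% confidence
PUBLISH_EVERY = 10          # send every 10th inference


def decide_label(compliant, not_compliant):
    """Return 'unsure' unless one of the two compliance classes reaches the threshold."""
    if compliant >= CONFIDENCE_THRESHOLD:
        return "compliant"
    if not_compliant >= CONFIDENCE_THRESHOLD:
        return "not compliant"
    return "unsure"


def should_publish(inference_count):
    """True for every 10th inference, to avoid saturating the IoT endpoint."""
    return inference_count % PUBLISH_EVERY == 0


# Of 30 consecutive inferences, only three would be published
published = [n for n in range(1, 31) if should_publish(n)]
```

Keeping the threshold and the publish interval as named constants makes both easy to tune once the model is running on a device.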