Mystique Unicorn App is building a new application based on the microservice architectural pattern. One of the services used by the app is exposed as a REST API and performs machine learning inference. This particular ML model and its dependent libraries need about 3GB of storage space. The dev team has been using Lambda for most of their APIs and exposing them using Amazon API Gateway. They are interested in using the same compute & gateway services for this ML API as well.

Currently (Q3 2020), Lambda offers only 512MB of temporary storage in `/tmp` and about 250MB for unzipped deployment packages, including layers. re:Invent might change these limits, but the team is really keen on getting started now.

Can you help them do that with Amazon API Gateway & AWS Lambda?
Amazon EFS is a fully managed shared file system that can be attached to a Lambda function. This allows developers to easily import large code libraries directly into their Lambda functions and share data across function invocations. As the files in EFS are loaded dynamically during function invocation, you can also ensure that the latest version of these libraries is always used by every new execution environment.
In this article, we will build an architecture similar to the one shown above. To bootstrap our EFS with the machine learning libraries and models, we will be using an EC2 instance. Once the EFS has been installed and configured with these artifacts, the EC2 instance can be terminated.
For the machine learning part, we will be using a pre-trained model open sourced by @nicolalandro and available on PyTorch Hub. This model classifies birds using a fine-grained image classifier. We will deploy this model in EFS. When we send the URL of an image to the model, it will return the bird species (broadly speaking).
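If you want a quick feel for the model before deploying anything, here is a minimal local sketch that pulls it from PyTorch Hub. The repository and entry-point names are taken from the model's Hub page; the extra keyword arguments mirror its published example and are assumptions here, not this repo's code.

```python
# Quick local sketch (not the Lambda code): load the pre-trained fine-grained
# bird classifier (NTS-Net, CUB-200) from PyTorch Hub. Requires torch to be installed.
import torch

model = torch.hub.load(
    "nicolalandro/ntsnet-cub200", "ntsnet", pretrained=True,
    **{"topN": 6, "device": "cpu", "num_classes": 200},
)
model.eval()
print(type(model))  # this model plus torch/torchvision is what needs ~3GB on EFS
```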
- This demo, instructions, scripts and cloudformation template are designed to be run in `us-east-1`. With a few modifications you can try it out in other regions as well (not covered here).
- 🛠 AWS CLI Installed & Configured - Get help here
- 🛠 AWS CDK Installed & Configured - Get help here
- 🛠 Python Packages, change the below commands to suit your OS; the following is written for Amazon Linux 2
  - Python3 - `yum install -y python3`
  - Python Pip - `yum install -y python-pip`
  - Virtualenv - `pip3 install virtualenv`
NOTE: Given that we are planning to run machine learning inferences on Lambda, the function needs enough compute and memory to return a response in a reasonable time. The automation in this repo sets up the lambda with `3008MB` of memory and a `5 minute` timeout. In addition to that, we will also be configuring a provisioned concurrency of `2` for our lambda function to avoid cold starts. Obviously, no attempt has been made to optimize these settings, as this is just a technology demonstration. Given the above and other resources like EC2, please be mindful of the costs involved in deploying and learning from this stack.
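For reference, a minimal CDK (Python) sketch of a function with these settings is shown below. The construct IDs, runtime and code path are placeholders, not the repo's exact code; only the memory, timeout and provisioned concurrency values mirror the note above.

```python
from aws_cdk import aws_lambda as _lambda
from aws_cdk import core

class MlLambdaSettingsSketch(core.Stack):
    """Illustrative only: mirrors the memory/timeout/provisioned-concurrency noted above."""

    def __init__(self, scope: core.Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        inference_fn = _lambda.Function(
            self, "mlInferenceFn",
            runtime=_lambda.Runtime.PYTHON_3_8,
            handler="index.lambda_handler",
            code=_lambda.Code.from_asset("lambda_src"),   # placeholder path
            memory_size=3008,                              # MB
            timeout=core.Duration.minutes(5),
        )

        # Provisioned concurrency is attached to a published version through an alias.
        _lambda.Alias(
            self, "mlInferenceFnAlias",
            alias_name="live",
            version=inference_fn.current_version,
            provisioned_concurrent_executions=2,
        )

app = core.App()
MlLambdaSettingsSketch(app, "ml-lambda-settings-sketch")
app.synth()
```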
Get the application code

```bash
git clone https://github.com/miztiik/serverless-machine-learning-api
cd serverless-machine-learning-api
```
We will need CDK to be installed to make our deployments easier. Let's go ahead and install the necessary components.

```bash
# If you DONT have cdk installed
npm install -g aws-cdk

# Make sure you are in the root directory
python3 -m venv .env
source .env/bin/activate
pip3 install -r requirements.txt
```
The very first time you deploy an AWS CDK app into an environment (account/region), you'll need to install a bootstrap stack. Otherwise just go ahead and deploy using `cdk deploy`.

```bash
cdk bootstrap
cdk ls
# Follow on screen prompts
```
You should see an output of the available stacks,

```bash
vpc-stack
efs-stack
pytorch-on-efs
serverless-machine-learning-api
```
Let us walk through each of the stacks,
**Stack: efs-stack**

We are going to create an EFS share and also create an `/ml` access point that will be used by our lambda function. We also need a VPC to host our EFS; the dependent stack `vpc-stack` will be automatically deployed for you. This stack will also set the `Acl` & `PosixUser` to `1000`.

To enable communication to our EFS, we will also set up an exclusive security group that allows port `2049` connections over `TCP` from any IP within the VPC. This will allow any EC2 instance and lambda functions within the VPC to read and write to our file share.

Initiate the deployment with the following command,
```bash
cdk deploy vpc-stack efs-stack
```
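For orientation, here is a hedged CDK (Python) sketch of the core constructs such a stack could use: an EFS file system, an `/ml` access point owned by POSIX user `1000`, and a security group opening TCP `2049` inside the VPC. Construct IDs and the inline VPC are placeholders, not the repo's exact code.

```python
from aws_cdk import aws_ec2 as _ec2
from aws_cdk import aws_efs as _efs
from aws_cdk import core

class EfsStackSketch(core.Stack):
    """Illustrative sketch of the EFS share and /ml access point (not the repo's exact stack)."""

    def __init__(self, scope: core.Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # In the repo the VPC comes from the dependent vpc-stack; created inline here for brevity.
        vpc = _ec2.Vpc(self, "mlVpc", max_azs=2)

        # Exclusive security group allowing NFS (TCP 2049) from any IP within the VPC.
        efs_sg = _ec2.SecurityGroup(self, "efsSecurityGroup", vpc=vpc)
        efs_sg.add_ingress_rule(
            peer=_ec2.Peer.ipv4(vpc.vpc_cidr_block),
            connection=_ec2.Port.tcp(2049),
            description="Allow NFS from within the VPC",
        )

        file_system = _efs.FileSystem(
            self, "mlEfs",
            vpc=vpc,
            security_group=efs_sg,
            removal_policy=core.RemovalPolicy.DESTROY,  # demo only
        )

        # Access point rooted at /ml, with Acl & PosixUser set to 1000.
        file_system.add_access_point(
            "mlAccessPoint",
            path="/ml",
            create_acl=_efs.Acl(owner_uid="1000", owner_gid="1000", permissions="750"),
            posix_user=_efs.PosixUser(uid="1000", gid="1000"),
        )

app = core.App()
EfsStackSketch(app, "efs-stack-sketch")
app.synth()
```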
**Stack: pytorch-on-efs**

To bootstrap our EFS with the machine learning library and models, we need an instance that can write to our EFS share. We will be using an EC2 instance and its `user_data` script to automatically download and install the libraries. The script will install `torch`, `torchvision` and `numpy`. The ML model will be downloaded from PyTorch Hub.

Initiate the deployment with the following command,
```bash
cdk deploy pytorch-on-efs
```
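The heart of this stack is the `user_data` script. Below is a hedged CDK (Python) sketch of what that bootstrap could look like; it expects the VPC and file system ID from the earlier stacks, and the mount path, instance size, package pinning and the model pre-caching command are assumptions, not the repo's exact script.

```python
from aws_cdk import aws_ec2 as _ec2
from aws_cdk import core

class PytorchOnEfsSketch(core.Stack):
    """Illustrative bootstrap instance that installs ML libraries & model onto the EFS share."""

    def __init__(self, scope: core.Construct, construct_id: str,
                 vpc: _ec2.IVpc, file_system_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        user_data = _ec2.UserData.for_linux()
        user_data.add_commands(
            # Mount the EFS share via efs-utils.
            "yum install -y amazon-efs-utils python3",
            "mkdir -p /mnt/efs",
            f"mount -t efs {file_system_id}:/ /mnt/efs",
            "mkdir -p /mnt/efs/ml/lib /mnt/efs/ml/model",
            # Install the libraries the Lambda will later import from EFS.
            "pip3 install --target /mnt/efs/ml/lib numpy torch torchvision",
            # Pre-cache the model from PyTorch Hub into what will become TORCH_HOME.
            "PYTHONPATH=/mnt/efs/ml/lib TORCH_HOME=/mnt/efs/ml/model "
            "python3 -c \"import torch; "
            "torch.hub.load('nicolalandro/ntsnet-cub200', 'ntsnet', pretrained=True)\"",
            # Match the access point's POSIX user so Lambda can read the files.
            "chown -R 1000:1000 /mnt/efs/ml",
        )

        _ec2.Instance(
            self, "efsBootstrapInstance",
            vpc=vpc,
            instance_type=_ec2.InstanceType("t3.medium"),
            machine_image=_ec2.AmazonLinuxImage(
                generation=_ec2.AmazonLinuxGeneration.AMAZON_LINUX_2
            ),
            user_data=user_data,
        )
```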
**Stack: serverless-machine-learning-api**

At this point, we are all set to configure our machine learning inference API using AWS Lambda and expose it using API Gateway. This stack `serverless-machine-learning-api` does just that for us. It will create the lambda function inside the same VPC as our EFS share. The EFS share will be available to lambda at the mount point `/mnt/inference`. The paths for the model and the dependent libraries are set as environment variables,

- `PYTHONPATH`: `/mnt/inference/lib`
- `TORCH_HOME`: `/mnt/inference/model`

Since we are also looking to avoid cold starts, the stack will create a versioned lambda and enable a provisioned concurrency of `1`.

Initiate the deployment with the following command,
```bash
cdk deploy serverless-machine-learning-api
```
Check the `Outputs` section of the stack to access the `MachineLearningInferenceApiUrl`.
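To make the wiring concrete, here is a hedged sketch of what such an inference handler could look like. It assumes the libraries resolve from `/mnt/inference/lib` via `PYTHONPATH` and the model cache sits under `TORCH_HOME`; the tuple unpacking of the model outputs, the preprocessing values and the `bird_classes` attribute follow the model's PyTorch Hub example and are assumptions here, not the repo's actual code.

```python
# -*- coding: utf-8 -*-
# Illustrative inference handler (not the repo's exact code). torch/torchvision/PIL
# resolve from /mnt/inference/lib via PYTHONPATH; the Hub cache lives in TORCH_HOME.
import json
import os
import urllib.request
from datetime import datetime

import torch
from PIL import Image
from torchvision import transforms

# Load once per execution environment, so warm/provisioned invocations reuse it.
MODEL = torch.hub.load(
    "nicolalandro/ntsnet-cub200", "ntsnet", pretrained=True,
    **{"topN": 6, "device": "cpu", "num_classes": 200},
)
MODEL.eval()

# ImageNet-style preprocessing; exact sizes follow the model's Hub example (assumption).
PREPROCESS = transforms.Compose([
    transforms.Resize((600, 600)),
    transforms.CenterCrop((448, 448)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])


def lambda_handler(event, context):
    img_url = (event.get("queryStringParameters") or {}).get("url")
    if not img_url:
        return {"statusCode": 400, "body": json.dumps({"message": "missing ?url= query string"})}

    local_path = "/tmp/bird-image"
    urllib.request.urlretrieve(img_url, local_path)
    img = PREPROCESS(Image.open(local_path).convert("RGB")).unsqueeze(0)

    with torch.no_grad():
        outputs = MODEL(img)
        # Per the Hub example the final class scores are the 4th element
        # (concat_logits) of the returned tuple - treated as an assumption here.
        _, predicted = torch.max(outputs[3], 1)
        # bird_classes maps indices to species names per the Hub example (assumption).
        bird_class = MODEL.bird_classes[predicted.item()]

    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": {"bird_class": bird_class},
            "lambda_version": os.environ.get("AWS_LAMBDA_FUNCTION_VERSION"),
            "ts": str(datetime.utcnow()),
        }),
    }
```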
We can use a tool like `curl` or `Postman` to query the URLs. The `Outputs` section of the respective stacks has the required information on the URLs.

```bash
$ WELL_ARCHICTED_API_URL="https://r4e3y68p11.execute-api.us-east-1.amazonaws.com/prod/serverless-machine-learning-api/greeter"
$ curl ${WELL_ARCHICTED_API_URL}
{
  "message": "Hello from Miztiikal World, How is it going?",
  "api_stage": "prod",
  "lambda_version": "38",
  "ts": "2020-08-26 13:03:19.810150"
}
```
We need to append the image URL as a query string. Here are a couple of sample images of birds (courtesy of Wikimedia). Update the `ML_API_URL` and try it out. You can try with other bird images that are publicly accessible.

```bash
$ ML_API_URL="https://ace17f0y9c.execute-api.us-east-1.amazonaws.com/prod/ml-api/identify-bird-species"
IMG_URL_1="https://upload.wikimedia.org/wikipedia/commons/d/d2/Western_Grebe_swimming.jpg"
IMG_URL_2="https://upload.wikimedia.org/wikipedia/commons/b/b5/House_Sparrow_%28Passer_domesticus%29-_Male_in_Kolkata_I_IMG_5904.jpg"
```
```bash
time curl ${ML_API_URL}?url=${IMG_URL_1}
```

Expected Output,

```
{
  "message": "{'bird_class': '053.Western_Grebe'}",
  "lambda_version": "14",
  "ts": "2020-09-07 17:47:58.469903"
}
real    0m27.570s
user    0m0.015s
sys     0m0.016s
```
```bash
time curl ${ML_API_URL}?url=${IMG_URL_2}
```

Expected Output,

```
{
  "message": "{'bird_class': '118.House_Sparrow'}",
  "lambda_version": "14",
  "ts": "2020-09-07 17:49:46.138871"
}
real    0m2.645s
user    0m0.020s
sys     0m0.032s
```
It is possible that the first invocation takes slightly longer (maybe even timing out at API Gateway) as the function has to initialize the libraries and models from EFS. Subsequent invocations should be significantly faster, at around `~3 seconds`.

Additional Learnings: You can check the logs in CloudWatch for more information, or increase the logging level of the lambda functions by changing the environment variable from `INFO` to `DEBUG`.
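In case you are wondering how such a toggle is usually wired, the snippet below is a minimal sketch; the variable name `LOG_LEVEL` is illustrative and may not match the repo's exact environment variable.

```python
import logging
import os

# Minimal sketch: an environment variable (e.g. LOG_LEVEL=DEBUG) controls verbosity.
# (Inside Lambda, where the runtime pre-configures a handler,
#  logging.getLogger().setLevel(...) on the root logger is enough.)
logging.basicConfig(level=os.environ.get("LOG_LEVEL", "INFO").upper())
LOGGER = logging.getLogger(__name__)

LOGGER.info("emitted at INFO and above")
LOGGER.debug("emitted only when LOG_LEVEL=DEBUG")
```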
Here we have demonstrated how to use an EFS share with Lambda as persistent storage. Here are a few other use cases that you can try with the same pattern,

- Media processing with `ffmpeg`: for example, keyframe extraction for highlights etc.
- Custom machine learning: for example, use `OpenCV` to process media
If you want to destroy all the resources created by the stack, execute the below commands to delete the stack, or you can delete the stack from the console as well

- Resources created during Deploying The Application
- Delete CloudWatch Lambda LogGroups
- Any other custom resources you have created for this demo

```bash
# Delete from cdk
cdk destroy

# Follow any on-screen prompts

# Delete the CF Stack, If you used cloudformation to deploy the stack.
aws cloudformation delete-stack \
  --stack-name "MiztiikAutomationStack" \
  --region "${AWS_REGION}"
```
This is not an exhaustive list; please carry out any other steps as may be applicable to your needs.
This repository aims to teach new developers, Solution Architects & Ops Engineers in AWS how to use persistent storage with serverless microservices running on AWS Lambda. Based on that knowledge, these Udemy courses (course #1, course #2) help you build a complete architecture in AWS.
Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional documentation or solutions, we greatly value feedback and contributions from our community. Start here
Buy me a coffee ☕.
Level: 300