Example for using long-running asynchronous activities with AWS Step Functions.
Step Functions can orchestrate Lambda functions directly, and it can also use an activity: a step whose actual work is done externally by a worker. One advantage of activities is the ability to perform long-running tasks. The state machine pauses its execution and waits for an update from the worker that polled for the task and is now executing it. The worker signals the end of the activity step with either a success or a failure status.
To demonstrate this, we will use a video ingestion workflow. This automated flow detects a video file uploaded to a designated S3 bucket and triggers a process that uses Amazon Rekognition to check whether the video is safe. High-level architecture:
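To make the idea of an activity step concrete, here is a minimal sketch of a state machine definition (Amazon States Language) built as a Python dictionary. The ARN, the state name `ModerateVideo`, and the one-hour timeout are illustrative assumptions, not values taken from this repo's template.

```python
import json

# Hypothetical ARN: region, account ID and activity name are placeholders.
ACTIVITY_ARN = "arn:aws:states:us-east-1:123456789012:activity:ModerateVideo"


def build_definition(activity_arn: str) -> dict:
    """Return a one-state definition whose only step waits on an activity."""
    return {
        "Comment": "Sketch: a single long-running activity step",
        "StartAt": "ModerateVideo",
        "States": {
            "ModerateVideo": {
                "Type": "Task",
                # An activity ARN (rather than a Lambda ARN) makes this a
                # step that an external worker has to pick up and complete.
                "Resource": activity_arn,
                # Assumed limit: fail the step if no worker reports back
                # within an hour.
                "TimeoutSeconds": 3600,
                "End": True,
            }
        },
    }


definition_json = json.dumps(build_definition(ACTIVITY_ARN), indent=2)
print(definition_json)
```

The execution stays in the `ModerateVideo` state until a worker reports success or failure for the task, or until the timeout expires.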
So what happens here?
- You upload a file to the Raw bucket
- The bucket is configured to send events when an object is created, which triggers a Lambda function (via an SNS topic, not shown in the diagram for simplicity)
- The Lambda function starts an execution of the Step Functions state machine
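The upload steps above can be sketched roughly as follows. This is not the repo's actual handler: the function names and the shape of the execution input (`bucket`, `key`) are assumptions for illustration. The Step Functions client is passed in so the logic can be exercised without AWS; inside Lambda it would typically be `boto3.client("stepfunctions")`.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sns_event: dict) -> list:
    """Return {"bucket", "key"} dicts for each S3 record wrapped in an SNS event."""
    objects = []
    for sns_record in sns_event.get("Records", []):
        # SNS delivers the original S3 notification as a JSON string.
        s3_event = json.loads(sns_record["Sns"]["Message"])
        # S3 test notifications carry no "Records" key, hence .get().
        for s3_record in s3_event.get("Records", []):
            objects.append({
                "bucket": s3_record["s3"]["bucket"]["name"],
                # Object keys arrive URL-encoded (a space becomes "+").
                "key": unquote_plus(s3_record["s3"]["object"]["key"]),
            })
    return objects


def start_executions(sns_event: dict, sfn_client, state_machine_arn: str) -> list:
    """Start one execution per uploaded object and return the execution ARNs."""
    execution_arns = []
    for obj in extract_s3_objects(sns_event):
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps(obj),  # assumed input shape, see lead-in
        )
        execution_arns.append(response["executionArn"])
    return execution_arns
```

Passing the bucket and key as the execution input is what later lets the activity worker find out which video it has to process.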
In parallel, a CloudWatch schedule rule invokes a poller Lambda function every minute to ask the state machine's activity for work. When the system is idle, no tasks are available; after a file upload, a task becomes available. Note: this is a naive implementation that handles only one task. A real-life implementation would poll continuously and run workers in a separate process or thread, ideally with a queue in between to decouple the poller from the worker.
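A single poll could look roughly like the sketch below. The function name `poll_once` and the worker name are hypothetical; `get_activity_task` is the boto3 call for the Step Functions `GetActivityTask` API. The client is injected so the sketch runs without AWS credentials.

```python
import json


def poll_once(sfn_client, activity_arn: str, worker_name: str = "poller"):
    """Long-poll the activity once; return (task_token, task_input) or None."""
    # GetActivityTask keeps the connection open for up to 60 seconds, so the
    # client's read timeout should be longer than that (AWS suggests at least
    # 65 seconds, e.g. via botocore.config.Config(read_timeout=65)).
    response = sfn_client.get_activity_task(
        activityArn=activity_arn,
        workerName=worker_name,
    )
    token = response.get("taskToken")
    if not token:
        # Idle system: the poll ends without a task token.
        return None
    # The task input is the JSON string passed into the activity state.
    return token, json.loads(response["input"])
```

The returned task token is the value a worker must later hand back to Step Functions when it reports the outcome, so it has to be kept somewhere the final step can read it.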
What happens now is:
- The polling Lambda function reads the details of the new video file from the newly available task and stores the task token (the identifier needed to report the result later)
- It sends an asynchronous request to Rekognition to detect whether the content is safe (Content Moderation)
- The asynchronous response from Rekognition arrives via SNS and triggers the third (and final) Lambda function, which signals the end of the task to our state machine
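The Rekognition request and the final callback described above can be sketched as follows. Several things here are assumptions rather than the repo's code: the function names, the `load_token` lookup (a hypothetical callable that maps a Rekognition job ID to the stored task token), the error names, and the choice to report unsafe content as a task failure. Clients are injected; in Lambda they would be `boto3.client("rekognition")` and `boto3.client("stepfunctions")`.

```python
import json


def start_moderation(rekognition, bucket, key, topic_arn, role_arn) -> str:
    """Start an asynchronous Content Moderation job and return its job ID."""
    response = rekognition.start_content_moderation(
        Video={"S3Object": {"Bucket": bucket, "Name": key}},
        # Rekognition publishes the job's completion status to this topic.
        NotificationChannel={"SNSTopicArn": topic_arn, "RoleArn": role_arn},
    )
    return response["JobId"]


def complete_task(sns_event, rekognition, sfn_client, load_token) -> bool:
    """Report the moderation outcome to Step Functions; True means approved."""
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    job_id = message["JobId"]
    token = load_token(job_id)  # hypothetical lookup of the stored task token

    if message["Status"] != "SUCCEEDED":
        sfn_client.send_task_failure(
            taskToken=token,
            error="ModerationJobFailed",  # assumed error name
            cause="Rekognition job %s ended with status %s" % (job_id, message["Status"]),
        )
        return False

    # Only the first page of results is read; an empty first page means
    # no moderation labels were detected at all.
    result = rekognition.get_content_moderation(JobId=job_id)
    labels = sorted({item["ModerationLabel"]["Name"] for item in result["ModerationLabels"]})
    if labels:
        # Assumption: unsafe content is reported as a failed task.
        sfn_client.send_task_failure(
            taskToken=token, error="UnsafeContent", cause=", ".join(labels)
        )
        return False

    sfn_client.send_task_success(taskToken=token, output=json.dumps({"safe": True}))
    return True
```

Whichever call is made, `send_task_success` or `send_task_failure`, it is what releases the state machine from the activity step it has been waiting on.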
- Create a new CloudFormation stack in your AWS account from the YAML file provided in this repo.
- Upload an .mp4 video file to the "RawVideoSourceBucket" bucket.
- From the AWS console, open the Step Functions dashboard and check the state machine execution.
- If the execution was successful, check the "approved" bucket - your video should be copied there!
Deploying this solution starts a scheduler that invokes the poller Lambda function every minute. Because the poller uses long polling with a 60-second timeout, the function is in effect running constantly. Keep this in mind, and remember to delete the stack when you are done.
The samples in this repository are meant to help users understand the concept of activities in Step Functions. They are not sufficient for production environments. Users should carefully inspect samples before running and/or using them.
Use at your own risk.
- AWS Step Functions - Workflow management and task orchestration
- Amazon S3 - Object storage for any scale
- Amazon Rekognition - Intelligent image and video analysis
- AWS Lambda - Run code without thinking about servers
- Amazon SNS - Pub/Sub messaging
- AWS CloudFormation - Model and set up your Amazon Web Services resources