About aws-mwaa-local-runner

This repository provides a command line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.

Please note: MWAA/AWS/DAG/Plugin issues should be raised through AWS Support or the Airflow Slack #airflow-aws channel. Issues here should be focused on this local-runner repository.

About the CLI

The CLI builds a Docker container image locally that is similar to an MWAA production image. This allows you to run a local Apache Airflow environment to develop and test DAGs, custom plugins, and dependencies before deploying to MWAA.

What this repo contains

dags/
  example_dag_with_custom_ssh_plugin.py
  example_dag_with_taskflow_api.py
  tutorial.py
requirements/  
  requirements.txt
docker/
  config/
    airflow.cfg
    constraints.txt
    mwaa-base-providers-requirements.txt
    requirements.txt
    webserver_config.py
    .env.localrunner
  script/
    bootstrap.sh
    entrypoint.sh
    systemlibs.sh
    generate_key.sh
  docker-compose-local.yml
  docker-compose-resetdb.yml
  docker-compose-sequential.yml
  Dockerfile
plugins/
  ssh_plugin.py
.gitignore
CODE_OF_CONDUCT.md
CONTRIBUTING.md
LICENSE
mwaa-local-env
README.md
VERSION

Prerequisites

Get started

git clone https://github.com/aws/aws-mwaa-local-runner.git
cd aws-mwaa-local-runner

Step one: Building the Docker image

Build the Docker container image using the following command:

./mwaa-local-env build-image

Note: it takes several minutes to build the Docker image locally.

Step two: Running Apache Airflow

Local runner

Runs a local Apache Airflow environment whose configuration closely mirrors MWAA.

./mwaa-local-env start

To stop the local environment, press Ctrl+C in the terminal and wait until the local runner and Postgres containers have stopped.

To run with a different DAG or requirements folder

./mwaa-local-env start -p <absolute_path_to_dag_folder> -r <absolute_path_to_requirements_folder>

where:

  • -p: Absolute path to the DAG folder
  • -r: Absolute path to the requirements folder

Both -p and -r are optional. If either is omitted, the default value is used ($PWD/dag or $PWD/requirements).
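The option rules above can be sketched as a small wrapper that assembles the command line. This is only an illustration of the documented behavior, not code from this repository: `build_start_command` is a hypothetical helper, and the only CLI details it relies on are the `start` subcommand and the `-p`/`-r` flags described above.

```python
import os
from typing import List, Optional


def build_start_command(
    dag_folder: Optional[str] = None,
    requirements_folder: Optional[str] = None,
) -> List[str]:
    """Build the argv list for `./mwaa-local-env start` (hypothetical helper)."""
    command = ["./mwaa-local-env", "start"]
    for flag, folder in (("-p", dag_folder), ("-r", requirements_folder)):
        if folder is None:
            # Flag omitted: the CLI falls back to its documented default folder.
            continue
        if not os.path.isabs(folder):
            # The README asks for absolute paths, so reject relative ones early.
            raise ValueError(f"{flag} expects an absolute path, got: {folder!r}")
        command += [flag, folder]
    return command


if __name__ == "__main__":
    # Example path is made up for illustration.
    print(build_start_command(dag_folder="/home/me/project/dags"))
```

The resulting list can be passed to `subprocess.run` from the repository root. Rejecting relative paths up front is a design choice of this sketch; the CLI itself may behave differently when given one.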

Step three: Accessing the Airflow UI

By default, the bootstrap.sh script creates a username and password for your local Airflow environment.

  • Username: admin
  • Password: test

Airflow UI

Step four: Add DAGs and supporting files

The following section describes where to add your DAG code and supporting files. We recommend creating a directory structure similar to your MWAA environment.

DAGs

  1. Add DAG code to the dags/ folder.
  2. To run the sample code in this repository, see the tutorial.py file.
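Before starting the local runner, it can be useful to confirm that every file in the DAG folder at least parses as Python. The following is a minimal sketch using only the standard library; `find_syntax_errors` is a hypothetical helper, not part of this repository, and it assumes DAG files live under dags/ as described above.

```python
import ast
from pathlib import Path
from typing import Dict


def find_syntax_errors(dags_dir: str) -> Dict[str, str]:
    """Map each unparsable .py file under dags_dir to a short error message."""
    errors = {}
    for path in sorted(Path(dags_dir).rglob("*.py")):
        try:
            # Parse only; the file is never imported or executed.
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as exc:
            errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return errors


if __name__ == "__main__":
    # Run from the repository root so that "dags" resolves to the DAG folder.
    for file_name, message in find_syntax_errors("dags").items():
        print(f"{file_name}: {message}")
```

This check catches syntax errors only. Missing imports, Airflow-specific problems, and runtime failures still need to be tested in the local Airflow environment.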

Requirements.txt

  1. Add Python dependencies to requirements/requirements.txt.
  2. To test a requirements.txt without running Apache Airflow, use the following script:
./mwaa-local-env test-requirements

For example, if you add aws-batch==0.6 to your requirements/requirements.txt file, you should see output similar to:

Installing requirements.txt
Collecting aws-batch (from -r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/5d/11/3aedc6e150d2df6f3d422d7107ac9eba5b50261cf57ab813bb00d8299a34/aws_batch-0.6.tar.gz
Collecting awscli (from aws-batch->-r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/07/4a/d054884c2ef4eb3c237e1f4007d3ece5c46e286e4258288f0116724af009/awscli-1.19.21-py2.py3-none-any.whl (3.6MB)
    100% |████████████████████████████████| 3.6MB 365kB/s 
...
...
...
Installing collected packages: botocore, docutils, pyasn1, rsa, awscli, aws-batch
  Running setup.py install for aws-batch ... done
Successfully installed aws-batch-0.6 awscli-1.19.21 botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2
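To compare what was installed against what you expected, the final line of the output above can be turned into a package-to-version mapping. This is a rough sketch that assumes the "Successfully installed name-version ..." format shown in the sample; pip's output may differ between versions, and `parse_installed` is a hypothetical helper rather than anything shipped with this repository.

```python
from typing import Dict

PREFIX = "Successfully installed "


def parse_installed(pip_output: str) -> Dict[str, str]:
    """Return {package: version} from pip's 'Successfully installed' line."""
    installed = {}
    for line in pip_output.splitlines():
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        for item in line[len(PREFIX):].split():
            # Split on the last hyphen: "aws-batch-0.6" -> ("aws-batch", "0.6").
            name, _, version = item.rpartition("-")
            installed[name] = version
    return installed


if __name__ == "__main__":
    sample = (
        "Successfully installed aws-batch-0.6 awscli-1.19.21 "
        "botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2"
    )
    for package, version in parse_installed(sample).items():
        print(f"{package}=={version}")
```

Printing the result in `name==version` form makes it easy to see which transitive dependencies, such as awscli in the sample, were pulled in by a single requirement.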

Custom plugins

  • There is a directory at the root of this repository called plugins. It contains a sample plugin, ssh_plugin.py.
  • In this directory, create a file for your new custom plugin. For example:
ssh_plugin.py
  • (Optional) Add any Python dependencies to requirements/requirements.txt.

Note: this step assumes you have a DAG that corresponds to the custom plugin. For examples, see MWAA Code Examples.

What's next?

FAQs

The following section contains common questions and answers you may encounter when using your Docker container image.

Can I test execution role permissions using this repository?

How do I add libraries to requirements.txt and test install?

  • A requirements.txt file is included in the /requirements folder of your local Docker container image. We recommend adding libraries to this file and testing the installation locally.

What if a library is not available on PyPI.org?

Troubleshooting

The following section contains errors you may encounter when using the Docker container image in this repository.

My environment is not starting - process failed with dag_stats_table already exists

  • If the process fails with the error "dag_stats_table already exists", reset your database using the following command:
./mwaa-local-env reset-db

Fernet Key InvalidToken

A Fernet key is generated during the image build (./mwaa-local-env build-image) and persists across all containers started from that image. This key is used to encrypt connection passwords in the Airflow DB. If you change the image and rebuild it, you may get a new key that does not match the key used when the Airflow DB was initialized. In that case, reset the DB (./mwaa-local-env reset-db).

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.