This example app is a complete AWS Glue setup that deploys two Glue jobs: one Python Shell job and one PySpark job.
This example was created as a composition of the following examples:
We have two options to run the services locally:
- Using VSCode + Remote Containers (recommended)
- Using Docker Compose (longer, but more flexible)

To use VSCode + Remote Containers:

- Install Docker
- Install VSCode
- Install the Remote Development extension
- Clone this repository
- Create your application within a container (see gif below)
After the container is running inside VSCode, you can run the jobs locally.
In the gif below, we run the following command:
glue-spark-submit src/jobs/pyspark.py --JOB_NAME job_example --CUSTOM_ARGUMENT custom_value
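
The command above passes `--JOB_NAME` and `--CUSTOM_ARGUMENT` to the job script. As a minimal sketch (not the actual contents of `src/jobs/pyspark.py`), a job could read those arguments with `getResolvedOptions` from `awsglue.utils`, which is available inside the Glue container. The helper name `resolve_args` and the `argparse` fallback are assumptions added for illustration, so the snippet also runs on a machine without the Glue libraries.

```python
import argparse

try:
    # Provided by the AWS Glue libraries (inside the Glue container or on AWS Glue).
    from awsglue.utils import getResolvedOptions
except ImportError:
    # Not installed locally: fall back to plain argparse below.
    getResolvedOptions = None


def resolve_args(argv, names):
    """Return a dict with the value passed as --NAME for each name in `names`.

    `resolve_args` is a hypothetical helper, not part of this repository.
    """
    if getResolvedOptions is not None:
        return getResolvedOptions(argv, names)

    # Fallback: every requested name becomes a required --NAME option.
    parser = argparse.ArgumentParser()
    for name in names:
        parser.add_argument(f"--{name}", required=True)
    # argv[0] is the script path; unknown extra options are ignored.
    parsed, _unknown = parser.parse_known_args(argv[1:])
    return vars(parsed)


# Same arguments as in the glue-spark-submit command above.
example_argv = [
    "src/jobs/pyspark.py",
    "--JOB_NAME", "job_example",
    "--CUSTOM_ARGUMENT", "custom_value",
]
args = resolve_args(example_argv, ["JOB_NAME", "CUSTOM_ARGUMENT"])
print(args["JOB_NAME"], args["CUSTOM_ARGUMENT"])
```

In a real job you would pass `sys.argv` instead of `example_argv`.
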
Check the full development documentation to learn how to set up a local development environment for DataNaN using Docker Compose.
We use Serverless Framework to deploy the AWS Glue jobs among other resources. Check the full deployment documentation to learn how to deploy the AWS Glue jobs.