A Slalom DataOps Lab
For this lab, you'll need:
- Installed DevOps tools:
  - VS Code, Python, and Terraform
- A GitHub account
- DBeaver Universal Database Tool
  - On Mac: `brew cask install dbeaver-community`
  - On Windows: `choco install dbeaver`
Option A: Start from Previous Lab (Recommended):
Use this option if you've already completed the previous lab and have successfully run `terraform apply`.
If you've completed the previous lab, and if you used the recommended Linux Academy Playground to do so, your 4-hour-limited AWS environment has likely been reset. Follow these instructions to get a new environment and reset your repo to use the new account.
- Create a new AWS Sandbox environment at playground.linuxacademy.com.
- Update your credentials file (`.secrets/aws-credentials`) with the newly provided Access Key ID and Secret Access Key.
- Navigate to your `infra` folder and rename the `terraform.tfstate` file to `terraform.tfstate.old` (see the example commands after this list).
  - IMPORTANT: In a real-world environment, you never want to delete or corrupt your "tfstate" file, since the state file is Terraform's way of tracking the resources it is responsible for. In this unique case, however, our environment has already been purged by the Linux Academy 4-hour time limit, and renaming this file lets us start fresh in a new account.
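For reference, the credential and state-file steps might look something like the following from a terminal at the repo root. (The `.secrets/aws-credentials` path comes from this lab's layout; the credentials format shown in the comments is the standard AWS credentials-file format and is an assumption about how your repo is configured.)

```bash
# 1. Paste the new keys from the Playground page into .secrets/aws-credentials,
#    which (assuming the standard AWS credentials format) looks roughly like:
#      [default]
#      aws_access_key_id     = <new Access Key ID>
#      aws_secret_access_key = <new Secret Access Key>

# 2. Set aside the old state file so Terraform can start fresh in the new account:
cd infra
mv terraform.tfstate terraform.tfstate.old
```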
That's it! You should now be able to run `terraform apply` again, which will recreate the same baseline environment you created in the previous "data-lake" lab.
Option B: Starting from Scratch:
If you've not yet completed the data lake lab, go back and do so now. (You can safely skip all exercises labeled "Extra Credit".)
- Create a new file called `02_databases.tf` in your `infra` folder.
- Copy-paste the following code into your new file:

  ```hcl
  module "postgres" {
    source              = "git::https://github.com/slalom-ggp/dataops-infra.git//catalog/aws/postgres?ref=main"
    name_prefix         = "${local.name_prefix}postgres-"
    environment         = module.env.environment
    resource_tags       = local.resource_tags

    identifier          = "my-postgres-db"
    admin_username      = "postgresadmin"
    admin_password      = "asdf1234"
    skip_final_snapshot = true
  }

  output "postgres_summary" {
    value = module.postgres.summary
  }
  ```
- Review the configuration variables in the module and compare them with the full list of configuration options in the documentation here.
- Open a new terminal in the `infra` folder (right-click the `infra` folder and select `Open in Integrated Terminal`).
- Run `terraform init` and then run `terraform apply` to deploy your changes.
  - Note that if you've already deployed the data lake lab, no changes will be proposed to the S3 buckets, the VPC, or the subnets.
- Using DBeaver and the connection information provided by `terraform output`, connect to your new database (see the example command after this list).
- In a new SQL Editor tab (`SQL Editor` menu -> `New SQL Editor`), paste and run the following commands to test that the database is working properly:

  ```sql
  create table test_table as
  select 42 as TheAnswer, 'N/A' as TheQuestion;

  select * from test_table;
  ```
Hitchhiker trivia: "Why 42?"
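If you need to look up the connection details again, you can re-print the summary output from the `infra` folder. The exact fields depend on the module version, but the summary should include the endpoint, port, and admin username:

```bash
# Print the "postgres_summary" output defined in 02_databases.tf
terraform output postgres_summary
```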
NOTE: The steps below also work with MySQL. Simply replace all `redshift` references with `mysql`.
In this step, you'll create a new Redshift cluster in the same way you created a Postgres database.
Option 1: Search and Replace:
- In VS Code, click anywhere in the `02_databases.tf` file and use `Ctrl+H` to open the search-and-replace tool (a command-line alternative is sketched after this list).
- Type "postgres" in the first box and "redshift" in the second box. Then use the `Replace` or `Replace All` buttons to modify your code.
- Remember to save your file with `Ctrl+S`.
- Run `terraform init` (because the module `source` has changed) and then `terraform apply` to deploy your new database.
  - Before (or after) typing 'yes' to confirm, take a minute or so to review the changes that Terraform is proposing. Notice that Terraform just "figures out" what to do: what can be modified in place, and what needs to be deleted and recreated from scratch.
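If you'd rather do the replacement from the command line instead of VS Code, a rough equivalent is a one-line `sed` (shown for both macOS/BSD and GNU variants; this is an alternative sketch, not part of the original lab steps):

```bash
# macOS / BSD sed:
sed -i '' 's/postgres/redshift/g' 02_databases.tf

# GNU sed (Linux / Git Bash):
# sed -i 's/postgres/redshift/g' 02_databases.tf
```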
Option 2: Add to Existing:
- At the end of the `02_databases.tf` file, paste in the code below. This will add an additional Redshift deployment to your existing configuration.

  ```hcl
  module "redshift" {
    source              = "git::https://github.com/slalom-ggp/dataops-infra.git//catalog/aws/redshift?ref=main"
    name_prefix         = "${local.name_prefix}redshift-"
    environment         = module.env.environment
    resource_tags       = local.resource_tags

    identifier          = "my-redshift-db"
    admin_username      = "redshiftadmin"
    admin_password      = "asdf1234"
    skip_final_snapshot = true
  }

  output "redshift_summary" {
    value = module.redshift.summary
  }
  ```
- Remember to save your file with `Ctrl+S`.
- Run `terraform init` (because you've added a new module reference) and then `terraform apply` to deploy the new database (see the example below for checking the new connection details).
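Once the apply finishes, you can confirm the new cluster's connection details the same way as before (the output names match what you defined in `02_databases.tf`; the summary contents depend on the module version):

```bash
# List all outputs, or request just the new Redshift summary:
terraform output
terraform output redshift_summary
```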
Optionally, you can explore the below Infrastructure Catalog samples. Note the similarities to your own configurations.
In this section, you will explore the Terraform source code used in the RDS and Redshift modules.
- Navigate to the Terraform doc to review the full set of options for RDS and Redshift:
- In a new tab, compare the above with the actual source code of the respective Infrastructure Catalog modules:
- Lastly, compare and contrast how the below two files each use the same "RDS" module to deploy the two different database platforms (note the difference in lines 23-24 of each file).
For troubleshooting tips, please see the Lab Troubleshooting Guide.