SIMPHERA Reference Architecture for AWS

This repository contains the reference architecture of the infrastructure needed to deploy dSPACE SIMPHERA to AWS. It does not contain the helm chart needed to deploy SIMPHERA itself, but only the base infrastructure such as Kubernetes, PostgreSQL, S3 buckets, etc.

You can use the reference architecture as a starting point for your SIMPHERA installation if you plan to deploy SIMPHERA to AWS. You can use it as it is and only have to configure a few individual values. If you have special requirements, feel free to adapt the architecture to your needs. For example, the reference architecture does not contain any kind of VPN connection to a private, on-premises network because this is highly user-specific. Instead, the reference architecture is configured in such a way that the ingress points are available on the public internet.

Using the reference architecture you can deploy a single or even multiple instances of SIMPHERA, e.g. one for production and one for testing.

Architecture

The following figure shows the main resources of the architecture:

[Figure: SIMPHERA Reference Architecture for AWS]

The main building block of the SIMPHERA reference architecture for AWS is the Amazon EKS cluster. The cluster contains two auto scaling groups: The first group is reserved for SIMPHERA services and other auxiliary third-party services like Keycloak, nginx, etc. The second group is for the executors that perform the testing of the system under test.

The data for SIMPHERA projects is stored in an Amazon RDS PostgreSQL instance. Keycloak stores SIMPHERA users in a separate Amazon RDS PostgreSQL instance.

Executors need licenses to execute tests and simulations. They obtain the licenses from a license server, which is deployed on an EC2 instance.

Project files and test results are stored in a non-public Amazon S3 bucket. For the initial setup of the license server, several files need to be exchanged between an administration PC and the license server. These files are exchanged via a non-public S3 bucket that can be read and written from the administration PC and the license server.

A detailed list of the AWS resources that are mandatory/optional for the operation of SIMPHERA can be found in the AWSCloudSpec.

Billable Resources and Services

Charges may apply for the following AWS resources and services:

| Service | Description | Mandatory? |
|---|---|---|
| Amazon Elastic Kubernetes Service | A Kubernetes cluster is required to run SIMPHERA. | Yes |
| Amazon Virtual Private Cloud | Virtual network for SIMPHERA. | Yes |
| Elastic Load Balancing | SIMPHERA uses a network load balancer. | Yes |
| Amazon EC2 Auto Scaling | SIMPHERA automatically scales compute nodes if the capacity is exhausted. | Yes |
| Amazon Relational Database Service | Project and authorization data is stored in Amazon RDS for PostgreSQL instances. | Yes |
| Amazon Simple Storage Service | Binary artifacts are stored in an S3 bucket. | Yes |
| Amazon Elastic File System | Binary artifacts are stored temporarily in EFS. | Yes |
| AWS Key Management Service (AWS KMS) | Encryption for Kubernetes secrets is enabled by default. | |
| Amazon Elastic Compute Cloud | Optionally, you can deploy a dSPACE license server on an EC2 instance. Alternatively, you can deploy the server on external infrastructure. For additional information, please contact our support team. | |
| Amazon CloudWatch | Metrics and container logs can be sent to CloudWatch. It is recommended to deploy the dSPACE monitoring stack in Kubernetes. | |

Usage Instructions

To create the AWS resources that are required for operating SIMPHERA, you need to accomplish the following tasks:

  1. install Terraform on your local administration PC
  2. register an AWS account where the resources needed for SIMPHERA are created
  3. create an IAM user with the least privileges required to create the resources for SIMPHERA
  4. create security credentials for that IAM user
  5. request a service quota increase for GPU instances if needed
  6. create a non-public S3 bucket for the Terraform state
  7. create an IAM policy that gives the IAM user access to the S3 bucket
  8. clone this repository onto your local administration PC
  9. create Secrets Manager secrets
  10. adjust Terraform variables
  11. apply the Terraform configuration
  12. connect to the Kubernetes cluster

Install Terraform

This reference architecture is provided as a Terraform configuration. Terraform is an open-source command line tool to automatically create and manage cloud resources. A Terraform configuration consists of various .tf text files. These files contain the specifications of the resources to be created in the cloud infrastructure. That is the reason why this approach is called infrastructure-as-code. The main advantage of this approach is reproducibility because the configuration can be maintained in a source control system such as Git.

Terraform uses variables to make the specification configurable. The concrete values for these variables are specified in .tfvars files. So it is the task of the administrator to fill the .tfvars files with the correct values. This is explained in more detail in a later chapter.

Terraform has the concept of a state. On the one hand, there are the resource specifications in the .tf files. On the other hand, there are the resources in the cloud infrastructure that are created based on these files. Terraform needs to store mapping information about which element of the specification belongs to which resource in the cloud infrastructure. This mapping is called the state. In general, you could store the state on your local hard drive. But that is not a good idea because in that case nobody else could change settings and apply these changes. Therefore, the state itself should be stored in the cloud.

Request service quota for GPU computing instances

If you want to run AURELION with your SIMPHERA solution, you need to add GPU instances to your cluster.

In case you want to add a GPU node pool to your AWS infrastructure, you might have to increase the quota for the GPU instance type you have selected. By default, the SIMPHERA Reference Architecture for AWS uses g5.2xlarge instances. The quota Running On-Demand G and VT instances sets the maximum number of vCPUs assigned to running On-Demand G and VT instances for a specific AWS region. Every g5.2xlarge instance has 8 vCPUs, which is why the quota has to be at least 8 for the AWS region where you want to deploy the instances.
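
As a rough sizing aid, the quota scales with the number of GPU nodes you plan to run. The following shell sketch multiplies a node count by the vCPUs per instance. The function name `required_gpu_vcpu_quota` is hypothetical; the value 8 is the vCPU count of a g5.2xlarge instance stated above, and 12 is the default of `gpuNodeCountMax` listed in the Inputs section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: vCPU quota needed for a given number of GPU nodes.
# Arguments: <node_count> [vcpus_per_node]; vcpus_per_node defaults to 8 (g5.2xlarge).
required_gpu_vcpu_quota() {
  local nodes="$1"
  local vcpus_per_node="${2:-8}"
  echo $(( nodes * vcpus_per_node ))
}

# A single g5.2xlarge node needs a quota of 8 vCPUs.
required_gpu_vcpu_quota 1
# Scaling up to the default gpuNodeCountMax of 12 nodes needs 96 vCPUs.
required_gpu_vcpu_quota 12
```

If you select a different instance type in `gpuNodeSize`, pass its vCPU count as the second argument.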

Create Security Credentials

You can create security credentials for that IAM user with the AWS console. Terraform uses these security credentials to create AWS resources on your behalf.

On your administration PC you need to install the Terraform command and the AWS CLI. To configure your AWS account, run the following command:

aws configure --profile <profile-name>

AWS Access Key ID [None]: *********
AWS Secret Access Key [None]: *******
Default region name [None]: eu-central-1
Default output format [None]: json

If you have been provided with a session token, you can add it via the following command:

aws configure set aws_session_token "<your_session_token>" --profile <profile-name>

Access credentials are typically stored in ~/.aws/credentials and configurations in ~/.aws/config. There are various ways to authenticate when running Terraform; which one applies depends on your specific setup.

Verify connectivity and your access credentials by executing the following command:

aws sts get-caller-identity

{
    "UserId": "REWAYDCFMNYCPKCWRZEHT:[email protected]",
    "Account": "592245445799",
    "Arn": "arn:aws:sts::592245445799:assumed-role/AWSReservedSSO_AdministratorAccess_vmcbaym7ueknr9on/[email protected]"
}

Create State Bucket

As mentioned before, Terraform stores the state of the resources it creates within an S3 bucket. The bucket name needs to be globally unique.

After you have created the bucket, you need to link it with Terraform: To do so, please make a copy of the file state-backend-template, name it state-backend.tf and open the file in a text editor. With this backend configuration, Terraform stores the state as a given key in the given S3 bucket you have created before.

terraform {
  backend "s3" {
    # The name of the bucket used to store the Terraform state. You need to create this bucket manually.
    bucket = "terraform-state"
    # The key (file name) under which the Terraform state is stored inside the bucket.
    key    = "simphera.tfstate"
    # The region of the bucket.
    region = "eu-central-1"
  }
}

Important: It is highly recommended to enable server-side encryption of the state file. Encryption is not enabled by default.
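
The state bucket can also be created from the command line. The following shell sketch wraps three AWS CLI calls: creating the bucket, blocking public access, and enabling default server-side encryption (SSE-S3). The function name `create_state_bucket` and the `AWS_CMD` variable are hypothetical and not part of this repository; `AWS_CMD` only exists so that the calls can be previewed by setting it to `echo`.

```shell
#!/usr/bin/env bash
# Sketch: create a non-public, encrypted S3 bucket for the Terraform state.
# AWS_CMD can be overridden (for example with "echo") to preview the calls.
AWS_CMD="${AWS_CMD:-aws}"

create_state_bucket() {
  local bucket="$1" region="$2" profile="$3"

  # Regions other than us-east-1 require an explicit location constraint.
  if [ "$region" = "us-east-1" ]; then
    "$AWS_CMD" s3api create-bucket --bucket "$bucket" --region "$region" \
      --profile "$profile" || return 1
  else
    "$AWS_CMD" s3api create-bucket --bucket "$bucket" --region "$region" \
      --profile "$profile" \
      --create-bucket-configuration "LocationConstraint=$region" || return 1
  fi

  # Keep the bucket non-public.
  "$AWS_CMD" s3api put-public-access-block --bucket "$bucket" --profile "$profile" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true" \
    || return 1

  # Enable default server-side encryption with S3-managed keys.
  "$AWS_CMD" s3api put-bucket-encryption --bucket "$bucket" --profile "$profile" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
}

# Usage (placeholders; the bucket name must be globally unique):
#   create_state_bucket "<bucket-name>" eu-central-1 "<profile-name>"
```

Running the function with `AWS_CMD=echo` prints the commands instead of executing them, which is a convenient way to review them before touching your account.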

Create IAM Policy for State Bucket

Create the following IAM policy for accessing the Terraform state bucket and assign it to the IAM user:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<your_account_arn>"
            },
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::terraform-state"
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<your_account_arn>"
            },
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": "arn:aws:s3:::terraform-state/<storage_key_state_backend>"
        }
    ]
}

Your account ARN (Amazon Resource Name) is in the output of the aws sts get-caller-identity command.

Create Secrets Manager Secrets

Username and password for the PostgreSQL databases are stored in AWS Secrets Manager. Before you let Terraform create AWS resources, you need to manually create a Secrets Manager secret that stores the username and password. It is recommended to create individual secrets per SIMPHERA instance (e.g. production and staging instance). To create the secret, open the Secrets Manager console and click the button Store a new secret. As secret type choose Other type of secret. The password must contain from 8 to 128 characters and must not contain any of the following: / (slash), ' (single quote), " (double quote), and @ (at sign). Open the Plaintext tab, paste the following JSON object, and enter your password:

{
  "postgresql_password": "<your password>"
}

Alternatively, you can create the secret with the following PowerShell script:

$region = "<your region>"
$postgresqlCredentials = @"
{
    "postgresql_password" : "<your password>"
}
"@ | ConvertFrom-Json | ConvertTo-Json -Compress
$postgresqlCredentials = $postgresqlCredentials -replace '([\\]*)"', '$1$1\"'
aws secretsmanager create-secret --name <secret name> --secret-string $postgresqlCredentials --region $region

If you use the Secrets Manager console, you can define a name for the secret on the next page. Automatic credentials rotation is currently not supported by SIMPHERA, but you can rotate secrets manually. You have to provide the name of the secret in your Terraform variables. The next section describes how you need to adjust your Terraform variables.
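
Before storing the secret, it can help to check a candidate password against the rules given in this section (8 to 128 characters, none of the characters /, ', ", and @). The following shell sketch does that; the function name `is_valid_postgresql_password` is hypothetical, and the sample passwords are placeholders only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: check a candidate password against the rules stated above
# (8 to 128 characters; must not contain / ' " or @).
is_valid_postgresql_password() {
  local password="$1"
  local length=${#password}
  if [ "$length" -lt 8 ] || [ "$length" -gt 128 ]; then
    return 1
  fi
  case "$password" in
    */*|*\'*|*\"*|*@*) return 1 ;;
  esac
  return 0
}

# Usage: prints "ok" for an acceptable password and "rejected" otherwise.
if is_valid_postgresql_password 'Example-Passw0rd'; then echo "ok"; else echo "rejected"; fi
if is_valid_postgresql_password 'bad@password'; then echo "ok"; else echo "rejected"; fi
```

The check only covers the length and character rules quoted above; it does not judge password strength.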

Adjust Terraform Variables

For your configuration, please rename the template file terraform.tfvars.example to terraform.tfvars and open it in a text editor. This file contains all variables that are configurable including documentation of the variables. Please adapt the values before you deploy the resources.

simpheraInstances = {
  "production" = {
    secretname = "<secret name>"
  }
}

Also rename the file providers.tf.example to main.tf and fill in the name of the AWS profile you have created before.

provider "aws" {
  profile = "<profile-name>"
}

Apply Terraform Configuration

Before you can deploy the resources to AWS you have to initialize Terraform:

terraform init

Afterwards you can deploy the resources:

terraform apply

Terraform automatically loads the variables from your terraform.tfvars variable definition file. Installation times may vary, but the deployment is expected to take up to 30 minutes. Note that the dependency of the eks-addons module on the managed node group(s) is commented out in the k8s.tf file. This might increase deployment time, because add-ons might be provisioned before any Kubernetes worker node has started and then have to wait for a node to complete their deployment. The default timeout for node and add-on deployment is 20 minutes, so please be patient. If this behavior causes problems, you can uncomment the line depends_on = [module.eks.managed_node_groups]. It is recommended to use an AWS admin account, or to ask your AWS administrator to assign the necessary IAM roles and permissions to your user.

Destroy Infrastructure

Resources that contain data, i.e. the databases, S3 storage, and the recovery points in the backup vault, are protected against unintentional deletion. Warning: If you continue with the procedure described in this section, your data will be irretrievably deleted.

Before the backup vault can be deleted, all the continuous recovery points for S3 storage and the databases need to be deleted, for example by using the following PowerShell snippet:

$vaults = terraform output backup_vaults | ConvertFrom-Json
$profile = "<profile_name>"
foreach ($vault in $vaults){
  Write-Host "Deleting $vault"
  $recoverypoints = aws backup list-recovery-points-by-backup-vault --profile $profile --backup-vault-name $vault | ConvertFrom-Json
  foreach ($rp in $recoverypoints.RecoveryPoints){
    aws backup delete-recovery-point --profile $profile --backup-vault-name $vault --recovery-point-arn $rp.RecoveryPointArn
  }
  foreach ($rp in $recoverypoints.RecoveryPoints){
    Do
    {
      Start-Sleep -Seconds 10
      aws backup describe-recovery-point --profile $profile --backup-vault-name $vault --recovery-point-arn $rp.RecoveryPointArn | ConvertFrom-Json
    } while( $LASTEXITCODE -eq 0)
  }
  aws backup delete-backup-vault --profile $profile --backup-vault-name $vault
}

Before the databases can be deleted, you need to remove their deletion protection:

$databases = terraform output database_identifiers | ConvertFrom-Json
foreach ($db in $databases){
  Write-Host "Deleting database $db"
  aws rds modify-db-instance --profile $profile --db-instance-identifier $db --no-deletion-protection
  aws rds delete-db-instance --profile $profile --db-instance-identifier $db --skip-final-snapshot
}

To delete the S3 buckets, which contain both versioned and non-versioned objects, the buckets must first be emptied. The following PowerShell script can be used to erase all objects within the buckets and then delete the buckets.

$aws_profile = "<profile_name>"
$buckets = terraform output s3_buckets | ConvertFrom-Json
foreach ($bucket in $buckets) {
    Write-Output "Deleting bucket: $bucket" 
    $deleteObjDict = @{}
    $deleteObj = New-Object System.Collections.ArrayList
    aws s3api list-object-versions --bucket $bucket --profile $aws_profile --query '[Versions[*].{ Key:Key , VersionId:VersionId} , DeleteMarkers[*].{ Key:Key , VersionId:VersionId}]' --output json `
    | ConvertFrom-Json | ForEach-Object { $_ } | ForEach-Object { $deleteObj.add($_) } | Out-Null
    $n = [math]::Ceiling($deleteObj.Count / 100)
    for ($i = 0; $i -lt $n; $i++) {
        $deleteObjDict["Objects"] = $deleteObj[(0 + $i * 100)..(100 * ($i + 1) - 1)]
        $deleteObjDict["Objects"] = $deleteObjDict["Objects"] | Where-Object { $_ -ne $null }
        $deleteStuff = $deleteObjDict | ConvertTo-Json
        aws s3api delete-objects --bucket $bucket --profile $aws_profile --delete $deleteStuff | Out-Null
    }
    aws s3 rb s3://$bucket --force --profile $aws_profile
    Write-Output "$bucket bucket deleted"
}

The remaining infrastructure resources can be deleted via Terraform by running the following command.

terraform destroy

Connect to Kubernetes Cluster

This deployment contains a managed Kubernetes cluster (EKS). In order to use command line tools such as kubectl or helm, you need a kubeconfig configuration file. You can update your kubeconfig using the AWS CLI update-kubeconfig command:

aws eks --region <region> update-kubeconfig --name <cluster_name> --kubeconfig <filename>
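
To confirm that the kubeconfig works, you can list the cluster nodes right after writing the file. The following shell sketch combines both steps. The function name `connect_to_cluster` and the variables `AWS_CMD` and `KUBECTL_CMD` are hypothetical; they only exist so that the calls can be previewed by setting them to `echo`. According to the Outputs section, the cluster name is available as the Terraform output `eks_cluster_id`.

```shell
#!/usr/bin/env bash
# Sketch: write a kubeconfig for the EKS cluster, then list its nodes as a connectivity check.
# AWS_CMD and KUBECTL_CMD can be overridden (for example with "echo") to preview the calls.
AWS_CMD="${AWS_CMD:-aws}"
KUBECTL_CMD="${KUBECTL_CMD:-kubectl}"

connect_to_cluster() {
  local region="$1" cluster="$2" kubeconfig="$3"
  # Write or update the kubeconfig file for the given cluster.
  "$AWS_CMD" eks --region "$region" update-kubeconfig \
    --name "$cluster" --kubeconfig "$kubeconfig" || return 1
  # List the worker nodes using that kubeconfig file.
  "$KUBECTL_CMD" get nodes --kubeconfig "$kubeconfig"
}

# Usage (all values are placeholders):
#   connect_to_cluster "<region>" "<cluster_name>" "<filename>"
```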

Backup and Restore

SIMPHERA stores data in the PostgreSQL database and in S3 buckets (MinIO), and both need to be backed up. AWS supports continuous backups for Amazon RDS for PostgreSQL and S3 that allow point-in-time recovery. Point-in-time recovery lets you restore your data to any point in time within a defined retention period.

This Terraform module creates an AWS backup plan that makes continuous backups of the PostgreSQL database and S3 buckets. The backups are stored in an AWS backup vault per SIMPHERA instance. An IAM role that has the proper permissions to create backups is also created automatically. To enable backups for your SIMPHERA instance, make sure you have the flag enable_backup_service set in your .tfvars file:

simpheraInstances = {
  "production" = {
        enable_backup_service    = true
    }
}

Amazon RDS for PostgreSQL

Create a target RDS instance (backup server) that is a copy of a source RDS instance (production server) at a specific point in time. The command restore-db-instance-to-point-in-time creates the target database. Most of the configuration settings are copied from the source database. To be able to connect to the target instance, the easiest way is to explicitly set the same security group and subnet group as used for the source instance.

An RDS instance can be restored via PowerShell as follows:

aws rds restore-db-instance-to-point-in-time --source-db-instance-identifier simphera-reference-production-simphera --target-db-instance-identifier simphera-reference-production-simphera-backup --vpc-security-group-ids sg-0b954a0e25cd11b6d --db-subnet-group-name simphera-reference-vpc --restore-time 2022-06-16T23:45:00.000Z --tags Key=timestamp,Value=2022-06-16T23:45:00.000Z

Execute the following command to create the pgdump pod using the standard postgres image and open a bash:

kubectl run pgdump -ti -n simphera --image postgres --kubeconfig .\kube.config -- bash

In the pod's Bash, use the pg_dump and pg_restore commands to stream the data from the backup server to the production server:

pg_dump -h simphera-reference-production-simphera-backup.cexy8brfkmxk.eu-central-1.rds.amazonaws.com -p 5432 -U dbuser -Fc simpherareferenceproductionsimphera | pg_restore --clean --if-exists -h simphera-reference-production-simphera.cexy8brfkmxk.eu-central-1.rds.amazonaws.com -p 5432 -U dbuser -d simpherareferenceproductionsimphera

Alternatively, you can restore the RDS instance via the AWS console.

S3

This Terraform configuration creates an S3 bucket for project data and results and enables versioning of the S3 bucket, which is a requirement for point-in-time recovery.

To restore the S3 buckets to an older version you need to create an IAM role that has proper permissions:

$rolename = "restore-role"
$trustrelation = @"
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["sts:AssumeRole"],
      "Effect": "Allow",
      "Principal": {
        "Service": ["backup.amazonaws.com"]
      }
    }
  ]
}
"@
echo $trustrelation > trust.json
aws iam create-role --role-name $rolename --assume-role-policy-document file://trust.json --description "Role to restore"
aws iam attach-role-policy --role-name $rolename --policy-arn="arn:aws:iam::aws:policy/AWSBackupServiceRolePolicyForS3Restore"
aws iam attach-role-policy --role-name $rolename --policy-arn="arn:aws:iam::aws:policy/service-role/AWSBackupServiceRolePolicyForRestores"
$rolearn=aws iam get-role --role-name $rolename --query 'Role.Arn' --output text

You can restore the S3 data in-place, into another existing bucket, or into a new bucket. The following PowerShell snippet restores the data into a new bucket:

$uuid = New-Guid
$metadata = @"
{
  "DestinationBucketName": "man-validation-platform-int-results",
  "NewBucket": "true",
  "RestoreTime": "2022-06-20T23:45:00.000Z",
  "Encrypted": "false",
  "CreationToken": "$uuid"
}
"@
$metadata = $metadata -replace '([\\]*)"', '$1$1\"'
aws backup start-restore-job `
--recovery-point-arn "arn:aws:backup:eu-central-1:012345678901:recovery-point:continuous:simphera-reference-production-0f51c39b" `
--iam-role-arn $rolearn `
--metadata $metadata

Alternatively, you can restore the S3 data via the AWS console.

Encryption

Encryption is enabled for all AWS resources that are created by Terraform:

  • PostgreSQL databases
  • S3 buckets
  • EFS (Elastic file system)
  • CloudWatch logs
  • Backup Vault

Tools and versions needed to deploy the SIMPHERA reference architecture

| Tool name | Version |
|---|---|
| AWS CLI | >=2.10.0 |
| Helm | >=3.8.0 |
| Terraform | >=1.3.0 |
| kubectl | >=1.27.0 |

Requirements

| Name | Version |
|---|---|
| terraform | >= 1.3.0 |
| aws | = 5.37.0 |
| helm | >= 2.4.1 |
| kubernetes | >= 2.10 |
| random | >= 3.0.0 |

Providers

| Name | Version |
|---|---|
| aws | 5.37.0 |
| kubernetes | 2.30.0 |
| random | 3.6.2 |

Modules

| Name | Source | Version |
|---|---|---|
| eks | git::https://github.com/aws-ia/terraform-aws-eks-blueprints.git | v4.32.1 |
| k8s_eks_addons | ./modules/k8s_eks_addons | n/a |
| security_group | terraform-aws-modules/security-group/aws | ~> 4 |
| security_group_license_server | terraform-aws-modules/security-group/aws | ~> 4 |
| simphera_instance | ./modules/simphera_aws_instance | n/a |
| vpc | terraform-aws-modules/vpc/aws | v5.8.1 |

Resources

Name Type
aws_autoscaling_group_tag.default_node-template_resources_ephemeral-storage resource
aws_autoscaling_group_tag.execnodes resource
aws_autoscaling_group_tag.execnodes_node-template_resources_ephemeral-storage resource
aws_autoscaling_group_tag.gpuexecnodes resource
aws_autoscaling_group_tag.gpuexecnodes_node-template_resources_ephemeral-storage resource
aws_autoscaling_group_tag.gpuivsnodes resource
aws_cloudwatch_log_group.flowlogs resource
aws_cloudwatch_log_group.ssm_install_log_group resource
aws_cloudwatch_log_group.ssm_scan_log_group resource
aws_ecr_pull_through_cache_rule.dspacecloudreleases resource
aws_efs_file_system.efs_file_system resource
aws_efs_file_system_policy.policy resource
aws_efs_mount_target.mount_target resource
aws_flow_log.flowlog resource
aws_iam_instance_profile.license_server_profile resource
aws_iam_policy.ecr_policy resource
aws_iam_policy.flowlogs_policy resource
aws_iam_policy.license_server_policy resource
aws_iam_role.flowlogs_role resource
aws_iam_role.license_server_role resource
aws_iam_role_policy_attachment.eks-attach-ecr resource
aws_iam_role_policy_attachment.flowlogs_attachment resource
aws_iam_role_policy_attachment.license_server_ssm resource
aws_iam_role_policy_attachment.minio_policy_attachment resource
aws_instance.license_server resource
aws_kms_key.kms_key_cloudwatch_log_group resource
aws_s3_bucket.bucket_logs resource
aws_s3_bucket.license_server_bucket resource
aws_s3_bucket_logging.logging resource
aws_s3_bucket_policy.buckets_logs_ssl resource
aws_s3_bucket_policy.license_server_bucket_ssl resource
aws_s3_bucket_public_access_block.buckets_logs_access resource
aws_s3_bucket_server_side_encryption_configuration.bucket_logs_encryption resource
aws_secretsmanager_secret.ecr_pullthroughcache_dspacecloudreleases resource
aws_secretsmanager_secret_version.ecr_credentials resource
aws_ssm_maintenance_window.install resource
aws_ssm_maintenance_window.scan resource
aws_ssm_maintenance_window_target.install resource
aws_ssm_maintenance_window_target.scan resource
aws_ssm_maintenance_window_target.scan_eks_nodes resource
aws_ssm_maintenance_window_task.install resource
aws_ssm_maintenance_window_task.scan resource
aws_ssm_patch_baseline.production resource
aws_ssm_patch_group.patch_group resource
kubernetes_storage_class_v1.efs resource
random_string.policy_suffix resource
aws_ami.al2gpu_ami data source
aws_ami.amazon_linux_kernel5 data source
aws_availability_zones.available data source
aws_caller_identity.current data source
aws_eks_node_group.default data source
aws_eks_node_group.execnodes data source
aws_eks_node_group.gpuexecnodes data source
aws_eks_node_group.gpuivsnodes data source
aws_iam_policy_document.eks_node_custom_inline_policy data source
aws_iam_policy_document.policy data source
aws_partition.current data source
aws_region.current data source
aws_subnet.private_subnet data source
aws_subnet.public_subnet data source
aws_subnets.private_subnets data source
aws_subnets.public_subnets data source
aws_vpc.preconfigured data source

Inputs

Name Description Type Default Required
aws_load_balancer_controller_config Input configuration for load_balancer_controller deployed with helm release. By setting key 'enable' to 'true', load_balancer_controller release will be deployed. 'helm_repository' is a URL for the repository of the load_balancer_controller helm chart, and 'helm_version' is the version of that chart. 'chart_values' is used for changing default values.yaml of a load_balancer_controller chart.
object({
enable = optional(bool, false)
helm_repository = optional(string, "https://aws.github.io/eks-charts")
helm_version = optional(string, "1.4.5")
chart_values = optional(string, <<-YAML

YAML
)
})
{
"enable": false
}
no
cloudwatch_retention Global cloudwatch retention period for the EKS, VPC, SSM, and PostgreSQL logs. number 7 no
cluster_autoscaler_config Input configuration for cluster-autoscaler deployed with helm release. By setting key 'enable' to 'true', cluster-autoscaler release will be deployed. 'helm_repository' is a URL for the repository of the cluster-autoscaler helm chart, and 'helm_version' is the version of that chart. 'chart_values' is used for changing default values.yaml of a cluster-autoscaler chart.
object({
enable = optional(bool, true)
helm_repository = optional(string, "https://kubernetes.github.io/autoscaler")
helm_version = optional(string, "9.37.0")
chart_values = optional(string, <<-YAML

YAML
)
})
{} no
codemeter Download link for codemeter rpm package. string "https://www.wibu.com/support/user/user-software/file/download/13346.html?tx_wibudownloads_downloadlist%5BdirectDownload%5D=directDownload&tx_wibudownloads_downloadlist%5BuseAwsS3%5D=0&cHash=8dba7ab094dec6267346f04fce2a2bcd" no
coredns_config Input configuration for AWS EKS add-on coredns. By setting key 'enable' to 'true', coredns add-on is deployed. Key 'configuration_values' is used to change add-on configuration. Its content should follow add-on configuration schema (see https://aws.amazon.com/blogs/containers/amazon-eks-add-ons-advanced-configuration/).
object({
enable = optional(bool, true)
configuration_values = optional(string, null)
})
{
"enable": true
}
no
ecr_pullthrough_cache_rule_config Specifies if ECR pull through cache rule and accompanying resources will be created. Key 'enable' indicates whether pull through cache rule needs to be enabled for the cluster. When 'enable' is set to 'true', key 'exist' indicates whether pull through cache rule already exists for region's private ECR. If key 'enable' is set to 'true', IAM policy will be attached to the cluster's nodes. Additionally, if 'exist' is set to 'false', credentials for upstream registry and pull through cache rule will be created
object({
enable = bool
exist = bool
})
{
"enable": false,
"exist": false
}
no
enable_ivs n/a bool false no
enable_patching Scans license server EC2 instance and EKS nodes for updates. Installs patches on license server automatically. EKS nodes need to be updated manually. bool false no
gpuNodeCountMax The maximum number of nodes for gpu job execution number 12 no
gpuNodeCountMin The minimum number of nodes for gpu job execution number 0 no
gpuNodeDiskSize The disk size in GiB of the nodes for the gpu job execution number 100 no
gpuNodePool Specifies whether an additional node pool for gpu job execution is added to the kubernetes cluster bool false no
gpuNodeSize The machine size of the nodes for the gpu job execution list(string)
[
"g5.2xlarge"
]
no
gpu_operator_config Input configuration for the GPU operator chart deployed with helm release. By setting key 'enable' to 'true', GPU operator will be deployed. 'helm_repository' is a URL for the repository of the GPU operator helm chart, and 'helm_version' is the version of that chart. 'chart_values' is used for changing default values.yaml of the GPU operator chart.
object({
enable = optional(bool, true)
helm_repository = optional(string, "https://helm.ngc.nvidia.com/nvidia")
helm_version = optional(string, "v24.9.0")
driver_version = optional(string, "550.90.07")
chart_values = optional(string, <<-YAML
operator:
defaultRuntime: containerd

dcgmExporter:
enabled: false

driver:
enabled: true

validator:
driver:
env:
- name: DISABLE_DEV_CHAR_SYMLINK_CREATION
value: "true"

toolkit:
enabled: true

daemonsets:
tolerations:
- key: purpose
value: gpu
operator: Equal
effect: NoSchedule

node-feature-discovery:
worker:
tolerations:
- key: purpose
value: gpu
operator: Equal
effect: NoSchedule
YAML
)
})
{
"enable": false
}
no
infrastructurename The name of the infrastructure. e.g. simphera-infra string "simphera" no
ingress_nginx_config Input configuration for ingress-nginx service deployed with helm release. By setting key 'enable' to 'true', ingress-nginx service will be deployed. 'helm_repository' is a URL for the repository of the ingress-nginx helm chart, and 'helm_version' is the version of that chart. 'chart_values' is used for changing default values.yaml of an ingress-nginx chart.
object({
enable = bool
helm_repository = optional(string, "https://kubernetes.github.io/ingress-nginx")
helm_version = optional(string, "4.1.4")
chart_values = optional(string, <<-YAML
controller:
images:
registry: "registry.k8s.io"
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
YAML
)
})
{
"enable": false
}
no
install_schedule 6-field Cron expression describing the install maintenance schedule. Must not overlap with variable scan_schedule. string "cron(0 3 * * ? *)" no
ivsGpuNodeCountMax The maximum number of GPU nodes for IVS jobs number 2 no
ivsGpuNodeCountMin The minimum number of GPU nodes for IVS jobs number 0 no
ivsGpuNodeDiskSize The disk size in GiB of the nodes for the IVS gpu job execution number 100 no
ivsGpuNodePool Specifies whether an additional node pool for IVS gpu job execution is added to the kubernetes cluster bool false no
ivsGpuNodeSize The machine size of the GPU nodes for IVS jobs list(string)
[
"g4dn.2xlarge"
]
no
kubernetesVersion The kubernetes version of the EKS cluster. string "1.30" no
licenseServer Specifies whether a license server VM will be created. bool false no
linuxExecutionNodeCountMax The maximum number of Linux nodes for the job execution number 10 no
linuxExecutionNodeCountMin The minimum number of Linux nodes for the job execution number 0 no
linuxExecutionNodeDiskSize The disk size in GiB of the nodes for the job execution number 200 no
linuxExecutionNodeSize The machine size of the Linux nodes for the job execution, user must check the availability of the instance types for the region. The list is ordered by priority where the first instance type gets the highest priority. Instance types must fulfill the following requirements: 64 GB RAM, 16 vCPUs, at least 110 IPs, at least 2 availability zones. list(string)
[
"m6a.4xlarge",
"m5a.4xlarge",
"m5.4xlarge",
"m6i.4xlarge",
"m4.4xlarge",
"m7i.4xlarge",
"m7a.4xlarge"
]
no
linuxNodeCountMax The maximum number of Linux nodes for the regular services number 12 no
linuxNodeCountMin The minimum number of Linux nodes for the regular services number 1 no
linuxNodeDiskSize The disk size in GiB of the nodes for the regular services number 200 no
linuxNodeSize The machine size of the Linux nodes for the regular services, user must check the availability of the instance types for the region. The list is ordered by priority where the first instance type gets the highest priority. Instance types must fulfill the following requirements: 64 GB RAM, 16 vCPUs, at least 110 IPs, at least 2 availability zones. list(string)
[
"m6a.4xlarge",
"m5a.4xlarge",
"m5.4xlarge",
"m6i.4xlarge",
"m4.4xlarge",
"m7i.4xlarge",
"m7a.4xlarge"
]
no
maintainance_duration How long in hours for the maintenance window. number 3 no
map_accounts Additional AWS account numbers to add to the aws-auth ConfigMap list(string) [] no
map_roles Additional IAM roles to add to the aws-auth ConfigMap
list(object({
rolearn = string
username = string
groups = list(string)
}))
[] no
map_users Additional IAM users to add to the aws-auth ConfigMap
list(object({
userarn = string
username = string
groups = list(string)
}))
[] no
private_subnet_ids List of IDs for the private subnets. list(any) [] no
public_subnet_ids List of IDs for the public subnets. list(any) [] no
rtMaps_link Download link for RTMaps license server. string "http://dl.intempora.com/RTMaps4/rtmaps_4.9.0_ubuntu1804_x86_64_release.tar.bz2" no
s3_csi_config Input configuration for AWS EKS add-on aws-mountpoint-s3-csi-driver. By setting key 'enable' to 'true', aws-mountpoint-s3-csi-driver add-on is deployed. Key 'configuration_values' is used to change add-on configuration. Its content should follow add-on configuration schema (see https://aws.amazon.com/blogs/containers/amazon-eks-add-ons-advanced-configuration/).
object({
enable = optional(bool, false)
configuration_values = optional(string, <<-YAML
node:
tolerateAllTaints: true
YAML
)
})
{
"enable": false
}
no
scan_schedule 6-field Cron expression describing the scan maintenance schedule. Must not overlap with variable install_schedule. string "cron(0 0 * * ? *)" no
simpheraInstances A map containing the individual SIMPHERA instances, such as 'staging' and 'production'.
map(object({
name = string
postgresqlApplyImmediately = bool
postgresqlVersion = string
postgresqlStorage = number
postgresqlMaxStorage = number
db_instance_type_simphera = string
enable_keycloak = bool
postgresqlStorageKeycloak = number
postgresqlMaxStorageKeycloak = number
db_instance_type_keycloak = string
k8s_namespace = string
secretname = string
enable_backup_service = bool
backup_retention = number
enable_deletion_protection = bool

}))
{
"production": {
"backup_retention": 35,
"db_instance_type_keycloak": "db.t4g.large",
"db_instance_type_simphera": "db.t4g.large",
"enable_backup_service": true,
"enable_deletion_protection": true,
"enable_keycloak": true,
"k8s_namespace": "simphera",
"name": "production",
"postgresqlApplyImmediately": false,
"postgresqlMaxStorage": 100,
"postgresqlMaxStorageKeycloak": 100,
"postgresqlStorage": 20,
"postgresqlStorageKeycloak": 20,
"postgresqlVersion": "16",
"secretname": "aws-simphera-dev-production"
}
}
no
tags The tags to be added to all resources. map(any) {} no
vpcCidr The CIDR for the virtual private cloud. string "10.1.0.0/18" no
vpcId The ID of preconfigured VPC. Change from 'null' to use already existing VPC. string null no
vpcPrivateSubnets List of CIDRs for the private subnets. list(any)
[
"10.1.0.0/22",
"10.1.4.0/22",
"10.1.8.0/22"
]
no
vpcPublicSubnets List of CIDRs for the public subnets. list(any)
[
"10.1.12.0/22",
"10.1.16.0/22",
"10.1.20.0/22"
]
no
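
The inputs install_schedule and scan_schedule listed above must not overlap. The following shell sketch shows one way to check this for two daily windows. It is a simplification under stated assumptions: both windows start on the full hour given in the cron expression, and both last maintainance_duration hours. The function name `windows_overlap` is hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: do two daily maintenance windows overlap?
# Arguments: <start_hour_a> <start_hour_b> <duration_hours>, start hours in 0..23.
# Assumption: both windows start on the full hour and last duration_hours.
windows_overlap() {
  local a="$1" b="$2" duration="$3"
  local diff=$(( a > b ? a - b : b - a ))
  # Distance between the two start hours on a 24-hour clock (handles midnight wrap-around).
  local distance=$(( diff < 24 - diff ? diff : 24 - diff ))
  [ "$distance" -lt "$duration" ]
}

# Defaults: scan at 00:00, install at 03:00, maintainance_duration of 3 hours.
if windows_overlap 0 3 3; then echo "overlap"; else echo "no overlap"; fi
```

With the default values, the scan window ends at the hour the install window starts, so the two windows are adjacent but do not overlap.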

Outputs

| Name | Description |
|---|---|
| account_id | The AWS account id used for creating resources. |
| backup_vaults | Backup vaults from all SIMPHERA instances. |
| database_endpoints | Endpoints of the SIMPHERA and Keycloak databases from all SIMPHERA instances. |
| database_identifiers | Identifiers of the SIMPHERA and Keycloak databases from all SIMPHERA instances. |
| eks_cluster_id | Amazon EKS Cluster Name |
| pullthrough_cache_prefix | n/a |
| s3_buckets | S3 buckets from all SIMPHERA instances. |