From 9c355477647760742841db2f112ae3d459c84687 Mon Sep 17 00:00:00 2001 From: milldr Date: Fri, 22 Nov 2024 12:51:18 -0500 Subject: [PATCH 1/8] docs on EKS FAQ --- docs/layers/eks/faq.mdx | 29 +++++++++++++++++++++++++++++ package-lock.json | 7 ++----- 2 files changed, 31 insertions(+), 5 deletions(-) diff --git a/docs/layers/eks/faq.mdx b/docs/layers/eks/faq.mdx index 968215e3a..35b8114f5 100644 --- a/docs/layers/eks/faq.mdx +++ b/docs/layers/eks/faq.mdx @@ -38,3 +38,32 @@ launch and scale runners for GitHub automatically. For more on how to set up ARC, see the [GitHub Action Runners setup docs for EKS](/layers/github-actions/eks-github-actions-controller/). + +## The managed nodes are successfully launching, but the worker nodes are not joining the cluster. What could be the issue? + +The most common issue is that the worker nodes are not able to communicate with the EKS cluster. This is usually due to missing cluster addons. If you connect to a node with session manager, you can check the kubelet logs. You might see an error like this: + +```console +kubelet ... "Failed to ensure lease exists, will retry" err="Unauthorized" interval="7s" +... csi_plugin.go:884] Failed to contact API server when waiting for CSINode publishing: Unauthorized +``` + +For the sake of version mapping, we have separated the cluster addon configuration into a single stack configuration file. That file has the version of the EKS cluster and the version of the addons that are compatible with that cluster version. + +The file is typically located at `stacks/catalog/eks/mixins/k8s-1-29.yaml` or `stacks/catalog/eks/cluster/mixins/k8s-1-29.yaml`, where `1.29` is the version of the EKS cluster. + +Make sure this file is imported and included with your stack. You can verify this by checking the final rendered configuration with Atmos: + +```bash +atmos describe component eks/cluster -s +``` + +## I am able to ping the cluster endpoint, but I am not able to connect to the cluster. 
What could be the issue? + +EKS cluster networking is complex. There are many issues the could cause this problem, so in our experience we recommend the AWS Reachability Analyzer. This tool can help you diagnose the issue by testing the network path between the source and destination. Make sure to test both directions. + +For example, we have found misconfigurations where the Security Group was not allowing traffic from the worker nodes to the EKS cluster. Or Transit Gateway was missing an account attachment. Or a subnet missing any given route. In all of these cases, the Reachability Analyzer exposes the issue. + +However, one particular issue we had to debug was related to a misconfiguration with subnet selection for managed nodes. Typically we set the EKS cluster to use private subnets for the managed nodes, with `cluster_private_subnets_only: true`. However, if this is not set, the managed nodes may choose public subnets in addition to private subnets. This can cause the cluster's control plane to be reachable by ping, but not properly configured nor accessible. + +Make sure to check the subnet selection for the managed nodes in the EKS cluster configuration. 
diff --git a/package-lock.json b/package-lock.json index 3f666001f..8d20ec124 100644 --- a/package-lock.json +++ b/package-lock.json @@ -6428,8 +6428,8 @@ "license": "MIT" }, "node_modules/custom-loaders": { - "resolved": "plugins/custom-loaders", - "link": true + "version": "0.0.0", + "resolved": "file:plugins/custom-loaders" }, "node_modules/cytoscape": { "version": "3.30.1", @@ -18860,9 +18860,6 @@ "type": "github", "url": "https://github.com/sponsors/wooorm" } - }, - "plugins/custom-loaders": { - "version": "0.0.0" } } } From 796bdb714d711aef54e2f1a0f618b896ce51af4c Mon Sep 17 00:00:00 2001 From: milldr Date: Fri, 22 Nov 2024 16:13:41 -0500 Subject: [PATCH 2/8] Reset package --- package-lock.json | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/package-lock.json b/package-lock.json index 8d20ec124..3f666001f 100644 --- a/package-lock.json +++ b/package-lock.json @@ -6428,8 +6428,8 @@ "license": "MIT" }, "node_modules/custom-loaders": { - "version": "0.0.0", - "resolved": "file:plugins/custom-loaders" + "resolved": "plugins/custom-loaders", + "link": true }, "node_modules/cytoscape": { "version": "3.30.1", @@ -18860,6 +18860,9 @@ "type": "github", "url": "https://github.com/sponsors/wooorm" } + }, + "plugins/custom-loaders": { + "version": "0.0.0" } } } From b9ab055f928dcf8e37a0066cf742fcd2b89843e6 Mon Sep 17 00:00:00 2001 From: milldr Date: Fri, 22 Nov 2024 16:33:06 -0500 Subject: [PATCH 3/8] improved EKS FAQ --- docs/layers/eks/faq.mdx | 43 ++++++++++++++++++++++++++--------------- 1 file changed, 27 insertions(+), 16 deletions(-) diff --git a/docs/layers/eks/faq.mdx b/docs/layers/eks/faq.mdx index 35b8114f5..c3d3375ac 100644 --- a/docs/layers/eks/faq.mdx +++ b/docs/layers/eks/faq.mdx @@ -39,31 +39,42 @@ launch and scale runners for GitHub automatically. For more on how to set up ARC, see the [GitHub Action Runners setup docs for EKS](/layers/github-actions/eks-github-actions-controller/). 
-## The managed nodes are successfully launching, but the worker nodes are not joining the cluster. What could be the issue? +## Managed nodes are successfully launching, but worker nodes are not joining the cluster -The most common issue is that the worker nodes are not able to communicate with the EKS cluster. This is usually due to missing cluster addons. If you connect to a node with session manager, you can check the kubelet logs. You might see an error like this: +Worker nodes are not joining the EKS cluster even though managed nodes are successfully launching. This often happens when worker nodes cannot communicate with the EKS cluster due to missing cluster add-ons. -```console -kubelet ... "Failed to ensure lease exists, will retry" err="Unauthorized" interval="7s" -... csi_plugin.go:884] Failed to contact API server when waiting for CSINode publishing: Unauthorized +Ensure that cluster add-ons compatible with your EKS cluster version are properly configured and included in your stack. Verify that the addon stack file (e.g., `stacks/catalog/eks/mixins/k8s-1-29.yaml`) is imported into your stack. You can confirm this by checking the final rendered component stack with Atmos: + +```bash +atmos describe component eks/cluster -s ``` -For the sake of version mapping, we have separated the cluster addon configuration into a single stack configuration file. That file has the version of the EKS cluster and the version of the addons that are compatible with that cluster version. +## I'm able to ping the cluster endpoint but unable to connect to the cluster -The file is typically located at `stacks/catalog/eks/mixins/k8s-1-29.yaml` or `stacks/catalog/eks/cluster/mixins/k8s-1-29.yaml`, where `1.29` is the version of the EKS cluster. +You can ping the EKS cluster endpoint but cannot connect to it using `kubectl` or other tools. This indicates a networking issue preventing proper communication with the cluster. 
-Make sure this file is imported and included with your stack. You can verify this by checking the final rendered configuration with Atmos: +Use the AWS Reachability Analyzer to diagnose the network path between your source and the EKS endpoint. Check for misconfigurations in security groups, Transit Gateway attachments, and subnet routes. Ensure that managed nodes are using private subnets by setting `cluster_private_subnets_only: true` in your EKS cluster configuration. -```bash -atmos describe component eks/cluster -s -``` +## AWS Client VPN clients not receiving routes to EKS cluster + +VPN clients connected via AWS Client VPN are not receiving routes to the EKS cluster’s VPC, preventing access to the API endpoint. -## I am able to ping the cluster endpoint, but I am not able to connect to the cluster. What could be the issue? +Verify that the Client VPN endpoint has active routes to the EKS VPC CIDR and that these routes are associated with subnets attached to the Client VPN endpoint. Confirm that authorization rules permit access to the EKS VPC CIDR. Ensure that security groups associated with the Client VPN endpoint allow outbound traffic to the EKS VPC. After making changes, disconnect and reconnect VPN clients to receive updated routes. -EKS cluster networking is complex. There are many issues the could cause this problem, so in our experience we recommend the AWS Reachability Analyzer. This tool can help you diagnose the issue by testing the network path between the source and destination. Make sure to test both directions. +## Common troubleshooting steps when unable to connect to EKS cluster -For example, we have found misconfigurations where the Security Group was not allowing traffic from the worker nodes to the EKS cluster. Or Transit Gateway was missing an account attachment. Or a subnet missing any given route. In all of these cases, the Reachability Analyzer exposes the issue. +1. 
Check EKS Cluster Security Groups: Ensure that inbound and outbound rules allow necessary traffic. +2. Verify Network ACLs: Confirm that Network ACLs permit the required inbound and outbound traffic. +3. Inspect Subnet Route Tables: Ensure that VPC route tables correctly route traffic between your source and the EKS cluster. +4. Confirm Transit Gateway Configuration: Verify that Transit Gateway attachments and route tables are properly set up. +5. Verify DNS Resolution: Check that the EKS API endpoint’s DNS name resolves correctly from your source. +6. *Use AWS Reachability Analyzer*: Analyze the network path to identify any connectivity issues. +7. Review EKS Cluster Endpoint Access Settings: Make sure the cluster’s endpoint access configuration aligns with your needs. +8. Check the EKS Cluster Subnets: Ensure that the EKS cluster subnets are correctly configured and associated with the cluster. We recommend using private subnets for managed nodes. +9. Check IAM Permissions: Ensure your IAM user or role has the necessary permissions to access the cluster. -However, one particular issue we had to debug was related to a misconfiguration with subnet selection for managed nodes. Typically we set the EKS cluster to use private subnets for the managed nodes, with `cluster_private_subnets_only: true`. However, if this is not set, the managed nodes may choose public subnets in addition to private subnets. This can cause the cluster's control plane to be reachable by ping, but not properly configured nor accessible. +For example, here's an example command to test connectivity to the EKS cluster's control plane endpoint. You can find this endpoint in the AWS web console or in Terraform outputs: -Make sure to check the subnet selection for the managed nodes in the EKS cluster configuration. 
+```bash +curl -fsSk --max-time 5 "$url/healthz" https://82F58026XXXXXXXXXXXXXXXXXXXXXXXX.gr7.us-east-1.eks.amazonaws.com +``` From d0756baeb607f10277cd90b198c8c6975c20149b Mon Sep 17 00:00:00 2001 From: milldr Date: Fri, 22 Nov 2024 16:33:52 -0500 Subject: [PATCH 4/8] Update curl command to test EKS cluster endpoint --- docs/layers/eks/faq.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/layers/eks/faq.mdx b/docs/layers/eks/faq.mdx index c3d3375ac..5b534ddd8 100644 --- a/docs/layers/eks/faq.mdx +++ b/docs/layers/eks/faq.mdx @@ -76,5 +76,5 @@ Verify that the Client VPN endpoint has active routes to the EKS VPC CIDR and th For example, here's an example command to test connectivity to the EKS cluster's control plane endpoint. You can find this endpoint in the AWS web console or in Terraform outputs: ```bash -curl -fsSk --max-time 5 "$url/healthz" https://82F58026XXXXXXXXXXXXXXXXXXXXXXXX.gr7.us-east-1.eks.amazonaws.com +curl -fsSk --max-time 5 "https://82F58026XXXXXXXXXXXXXXXXXXXXXXXX.gr7.us-east-1.eks.amazonaws.com/healthz" ``` From d95ba94b0ff44e78641da0bff5029b56f12bfe4f Mon Sep 17 00:00:00 2001 From: milldr Date: Fri, 22 Nov 2024 16:35:09 -0500 Subject: [PATCH 5/8] Update AWS Reachability Analyzer for VPN and EKS connectivity --- docs/layers/eks/faq.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/layers/eks/faq.mdx b/docs/layers/eks/faq.mdx index 5b534ddd8..6c32b09de 100644 --- a/docs/layers/eks/faq.mdx +++ b/docs/layers/eks/faq.mdx @@ -68,7 +68,7 @@ Verify that the Client VPN endpoint has active routes to the EKS VPC CIDR and th 3. Inspect Subnet Route Tables: Ensure that VPC route tables correctly route traffic between your source and the EKS cluster. 4. Confirm Transit Gateway Configuration: Verify that Transit Gateway attachments and route tables are properly set up. 5. Verify DNS Resolution: Check that the EKS API endpoint’s DNS name resolves correctly from your source. -6. 
*Use AWS Reachability Analyzer*: Analyze the network path to identify any connectivity issues. +6. *Use AWS Reachability Analyzer*: Analyze the network path to identify any connectivity issues. Set the VPNs ENI as the source and the EKS cluster endpoint private IP as the destination. _Check both directions_. 7. Review EKS Cluster Endpoint Access Settings: Make sure the cluster’s endpoint access configuration aligns with your needs. 8. Check the EKS Cluster Subnets: Ensure that the EKS cluster subnets are correctly configured and associated with the cluster. We recommend using private subnets for managed nodes. 9. Check IAM Permissions: Ensure your IAM user or role has the necessary permissions to access the cluster. From efcb1b5cec53f0b2ef1d5d61006e035a98951547 Mon Sep 17 00:00:00 2001 From: milldr Date: Wed, 4 Dec 2024 12:27:24 -0500 Subject: [PATCH 6/8] improvements to cold start and eks faq --- docs/layers/accounts/account-baseline.mdx | 86 +++++++++++++++++- docs/layers/accounts/deploy-accounts.mdx | 8 +- .../accounts/prepare-aws-organization.mdx | 7 +- docs/layers/eks/faq.mdx | 89 +++++++++++++------ package-lock.json | 7 +- 5 files changed, 158 insertions(+), 39 deletions(-) diff --git a/docs/layers/accounts/account-baseline.mdx b/docs/layers/accounts/account-baseline.mdx index fa28871a9..055300d18 100644 --- a/docs/layers/accounts/account-baseline.mdx +++ b/docs/layers/accounts/account-baseline.mdx @@ -1,7 +1,7 @@ --- title: "Deploy CloudTrail and ECR" sidebar_label: "Deploy Account Baseline" -sidebar_position: 4 +sidebar_position: 5 --- import Intro from '@site/src/components/Intro'; import KeyPoints from '@site/src/components/KeyPoints'; @@ -26,4 +26,88 @@ Now that all the accounts have been deployed, we need to finalize the setup of t + + + ## (Optional) Deploy Account Budgets + + Budgets are an optional feature that can be enabled with [the `account-settings` component](/components/library/aws/account-settings/) for the Organization as a whole or for 
individual accounts. Budgets *do not restrict spending* but provide visibility into spending and can be used to set alerts when spending exceeds a certain threshold. We recommend using a dedicated Slack channel for these alerts, which we will set up with a webhook. + + + - [ ] [Create a Slack Webhook](https://api.slack.com/messaging/webhooks). Take note of the Webhook URL and the final name of the Slack channel. The Slack channel is case-sensitive and needs to match the name of the channel exactly as the name appears in owning Slack server (not the name if changed as a shared channel). + - [ ] Update the `account-settings` component with the Slack Webhook URL and the Slack channel name. + ```yaml + components: + terraform: + account-settings: + vars: + budgets_enabled: true + budgets_notifications_enabled: true + budgets_slack_webhook_url: https://url.slack.com/abcd/1234 + budgets_slack_username: AWS Budgets + budgets_slack_channel: aws-budgets-notifications + ``` + - [ ] To enable budgets for the entire organization, update `account-settings` in the same account as the Organization root account, typically `core-root`. This budget will include the total spending of all accounts in the Organization. 
+ ```yaml + # stacks/orgs/acme/core/root/global-region/baseline.yaml + import: + - catalog/account-settings + + components: + terraform: + account-settings: + vars: + # Budgets in `root` apply to the Organization as a whole + budgets: + - name: Total AWS Organization Cost per Month + budget_type: COST + limit_amount: 10000 + limit_unit: USD + time_unit: MONTHLY + notification: + - comparison_operator: GREATER_THAN + notification_type: FORECASTED + threshold_type: PERCENTAGE + threshold: 80 + subscribers: + - slack + - comparison_operator: GREATER_THAN + notification_type: FORECASTED + threshold_type: PERCENTAGE + threshold: 100 + subscribers: + - slack + - comparison_operator: GREATER_THAN + notification_type: ACTUAL + threshold_type: PERCENTAGE + threshold: 100 + subscribers: + - slack + ``` + - [ ] To enable budgets for individual accounts, update `account-settings` in the account you want to enable budgets for or as the default setting for all `account-settings` components to apply to every account. This budget will include the spending of the given account only. + ```yaml + # stacks/catalog/account-settings.yaml + components: + terraform: + account-settings: + vars: + ... 
+ budgets: + - name: 1000-total-monthly + budget_type: COST + limit_amount: "1000" + limit_unit: USD + time_unit: MONTHLY + - name: s3-3GB-limit-monthly + budget_type: USAGE + limit_amount: "3" + limit_unit: GB + time_unit: MONTHLY + ``` + + - [ ] Finally, reapply `account-settings` in any changed account to apply the new settings + + + + + diff --git a/docs/layers/accounts/deploy-accounts.mdx b/docs/layers/accounts/deploy-accounts.mdx index fcdd9f0c2..4fcd8b31b 100644 --- a/docs/layers/accounts/deploy-accounts.mdx +++ b/docs/layers/accounts/deploy-accounts.mdx @@ -48,17 +48,15 @@ This step-by-step process outlines how to deploy AWS accounts using `atmos` work - ## Configure Root Account as Organization + ## Confirm the Root Account is configured as an Organization - Before performing the "Deploy Accounts" step, the root account needs to be configured as an AWS Organization. - - This process also enables [AWS RAM for Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_enable-ram.html) via a CLI command, which is required for connecting the Organization. + The previous step will create the AWS Organization and configure the `core-root` account as the "root" account. Take the time now to verify that the root account is configured as an AWS Organization and that [AWS RAM for Organizations](https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_enable-ram.html) is enabled, which is required for connecting the Organization. ## Raise Account Limits - To deploy all accounts, we need to request an increase of the Account Quota from AWS support, which requires an AWS Organization to be created first. + If you haven't already completed the Account Quota increase, now is the time to do so. To deploy all accounts, we need to request an increase of the Account Quota from AWS support, which requires an AWS Organization to be created first. 
From the `root` account (not `SuperAdmin`), increase the [account quota to 20+](https://us-east-1.console.aws.amazon.com/servicequotas/home/services/organizations/quotas) for the Cloud Posse reference architecture, or more depending on your business use-case diff --git a/docs/layers/accounts/prepare-aws-organization.mdx b/docs/layers/accounts/prepare-aws-organization.mdx index cd74382e8..951a35afc 100644 --- a/docs/layers/accounts/prepare-aws-organization.mdx +++ b/docs/layers/accounts/prepare-aws-organization.mdx @@ -55,8 +55,13 @@ From the root account: For billing users, you need to enable IAM access. As the root user [open up the account settings for AWS Billing](https://us-east-1.console.aws.amazon.com/billing/home?region=us-east-1#/Account), then scroll to the section "IAM user and role access to Billing information" and enable it. -3. ### Enable Regions (Optional) +1. ### Enable Regions (Optional) The 17 original AWS regions are enabled by default. If you are using a region that is not enabled by default (such as Middle East/Bahrain), you need to take extra steps. For details, see [the detailed documentation](/layers/accounts/tutorials/manual-configuration/#optional-enable-regions) +1. ### Prepare for Account Quota Increase + In order to deploy all accounts, you need to request an increase of the Account Quota from AWS support. This requires an AWS Organization to be created first, which we will create with Terraform in the [Deploy Accounts guide](/layers/accounts/deploy-accounts/#-prepare-account-deployment). This request can take a few days to process, so it's important to get it started early so that it doesn't become a blocker. + + At this time we don't need to request the increase, but we should be prepared to do so as soon as the AWS Organization is created. 
+ For more details, see diff --git a/docs/layers/eks/faq.mdx b/docs/layers/eks/faq.mdx index 6c32b09de..654d25db9 100644 --- a/docs/layers/eks/faq.mdx +++ b/docs/layers/eks/faq.mdx @@ -39,42 +39,77 @@ launch and scale runners for GitHub automatically. For more on how to set up ARC, see the [GitHub Action Runners setup docs for EKS](/layers/github-actions/eks-github-actions-controller/). -## Managed nodes are successfully launching, but worker nodes are not joining the cluster +## Common Connectivity Issues and Solutions -Worker nodes are not joining the EKS cluster even though managed nodes are successfully launching. This often happens when worker nodes cannot communicate with the EKS cluster due to missing cluster add-ons. +If you're having trouble connecting to your EKS cluster, follow these comprehensive steps to diagnose and resolve the issue: -Ensure that cluster add-ons compatible with your EKS cluster version are properly configured and included in your stack. Verify that the addon stack file (e.g., `stacks/catalog/eks/mixins/k8s-1-29.yaml`) is imported into your stack. You can confirm this by checking the final rendered component stack with Atmos: + +**1. Test Basic Connectivity** + +First, test basic connectivity to your cluster endpoint. This helps isolate whether the issue is with basic network connectivity or something more specific: ```bash -atmos describe component eks/cluster -s +curl -fsSk --max-time 5 "https://CLUSTER_ENDPOINT/healthz" ``` -## I'm able to ping the cluster endpoint but unable to connect to the cluster - -You can ping the EKS cluster endpoint but cannot connect to it using `kubectl` or other tools. This indicates a networking issue preventing proper communication with the cluster. - -Use the AWS Reachability Analyzer to diagnose the network path between your source and the EKS endpoint. Check for misconfigurations in security groups, Transit Gateway attachments, and subnet routes. 
Ensure that managed nodes are using private subnets by setting `cluster_private_subnets_only: true` in your EKS cluster configuration. - -## AWS Client VPN clients not receiving routes to EKS cluster +If this test fails, it indicates a fundamental connectivity issue that must be addressed before proceeding to more specific troubleshooting. -VPN clients connected via AWS Client VPN are not receiving routes to the EKS cluster’s VPC, preventing access to the API endpoint. +**2. Check Node Communication** -Verify that the Client VPN endpoint has active routes to the EKS VPC CIDR and that these routes are associated with subnets attached to the Client VPN endpoint. Confirm that authorization rules permit access to the EKS VPC CIDR. Ensure that security groups associated with the Client VPN endpoint allow outbound traffic to the EKS VPC. After making changes, disconnect and reconnect VPN clients to receive updated routes. +If worker nodes aren't joining the cluster, follow these detailed steps: -## Common troubleshooting steps when unable to connect to EKS cluster - -1. Check EKS Cluster Security Groups: Ensure that inbound and outbound rules allow necessary traffic. -2. Verify Network ACLs: Confirm that Network ACLs permit the required inbound and outbound traffic. -3. Inspect Subnet Route Tables: Ensure that VPC route tables correctly route traffic between your source and the EKS cluster. -4. Confirm Transit Gateway Configuration: Verify that Transit Gateway attachments and route tables are properly set up. -5. Verify DNS Resolution: Check that the EKS API endpoint’s DNS name resolves correctly from your source. -6. *Use AWS Reachability Analyzer*: Analyze the network path to identify any connectivity issues. Set the VPNs ENI as the source and the EKS cluster endpoint private IP as the destination. _Check both directions_. -7. Review EKS Cluster Endpoint Access Settings: Make sure the cluster’s endpoint access configuration aligns with your needs. -8. 
Check the EKS Cluster Subnets: Ensure that the EKS cluster subnets are correctly configured and associated with the cluster. We recommend using private subnets for managed nodes. -9. Check IAM Permissions: Ensure your IAM user or role has the necessary permissions to access the cluster. - -For example, here's an example command to test connectivity to the EKS cluster's control plane endpoint. You can find this endpoint in the AWS web console or in Terraform outputs: +- Verify that the addon stack file (e.g., `stacks/catalog/eks/mixins/k8s-1-29.yaml`) is imported into your stack. +- Verify cluster add-ons are properly configured for your EKS version. + - Check CoreDNS is running + - Verify kube-proxy is deployed + - Ensure VPC CNI is correctly configured +- Confirm the rendered component stack configuration. ```bash -curl -fsSk --max-time 5 "https://82F58026XXXXXXXXXXXXXXXXXXXXXXXX.gr7.us-east-1.eks.amazonaws.com/healthz" +atmos describe component eks/cluster -s ``` + +**3. Verify Network Configuration** + +- Security Groups: + - Control plane security group must allow port 443 inbound from worker nodes + - Worker node security group must allow all traffic between nodes + - Verify outbound internet access for pulling container images +- Subnet Routes: + - Verify route tables have paths to all required destinations + - Check for conflicting or overlapping CIDR ranges + - Ensure NAT Gateway is properly configured for private subnets +- Transit Gateway: + - Verify TGW attachments are active and associated + - Check TGW route tables for correct propagation + - Confirm cross-account routing if applicable +- Private Subnets Configuration: + - Set `cluster_private_subnets_only: true` in your configuration + - Ensure private subnets have proper NAT Gateway routing + +**4. 
VPN Connectivity** + +When accessing via AWS Client VPN, verify these configurations: + +- VPN Routes: + - Check route table entries for EKS VPC CIDR + - Verify routes are active and not in pending state + - Confirm no conflicting routes exist +- Subnet Associations: + - Ensure VPN endpoint is associated with correct subnets + - Verify subnet route tables include VPN CIDR range +- Authorization Rules: + - Check network ACLs allow VPN CIDR range + - Verify security group rules permit VPN traffic + - Confirm IAM roles have necessary permissions + +After making any changes, have clients disconnect and reconnect to receive updated routes. + +**5. Advanced Diagnostics** + +- AWS Reachability Analyzer: + - Enable cross-account analysis for VPC peering or TGW connections + - Test from VPN ENI to cluster endpoint + - Test return path from cluster to VPN ENI + + diff --git a/package-lock.json b/package-lock.json index 3f666001f..8d20ec124 100644 --- a/package-lock.json +++ b/package-lock.json @@ -6428,8 +6428,8 @@ "license": "MIT" }, "node_modules/custom-loaders": { - "resolved": "plugins/custom-loaders", - "link": true + "version": "0.0.0", + "resolved": "file:plugins/custom-loaders" }, "node_modules/cytoscape": { "version": "3.30.1", @@ -18860,9 +18860,6 @@ "type": "github", "url": "https://github.com/sponsors/wooorm" } - }, - "plugins/custom-loaders": { - "version": "0.0.0" } } } From 136a8ce4a074ab60bd3a88077867f68c545f5493 Mon Sep 17 00:00:00 2001 From: milldr Date: Wed, 4 Dec 2024 13:07:36 -0500 Subject: [PATCH 7/8] formatting steps --- docs/layers/accounts/account-baseline.mdx | 146 +++++++++++----------- 1 file changed, 72 insertions(+), 74 deletions(-) diff --git a/docs/layers/accounts/account-baseline.mdx b/docs/layers/accounts/account-baseline.mdx index 055300d18..965116130 100644 --- a/docs/layers/accounts/account-baseline.mdx +++ b/docs/layers/accounts/account-baseline.mdx @@ -33,81 +33,79 @@ Now that all the accounts have been deployed, we need to finalize the 
setup of t Budgets are an optional feature that can be enabled with [the `account-settings` component](/components/library/aws/account-settings/) for the Organization as a whole or for individual accounts. Budgets *do not restrict spending* but provide visibility into spending and can be used to set alerts when spending exceeds a certain threshold. We recommend using a dedicated Slack channel for these alerts, which we will set up with a webhook. - - [ ] [Create a Slack Webhook](https://api.slack.com/messaging/webhooks). Take note of the Webhook URL and the final name of the Slack channel. The Slack channel is case-sensitive and needs to match the name of the channel exactly as the name appears in owning Slack server (not the name if changed as a shared channel). - - [ ] Update the `account-settings` component with the Slack Webhook URL and the Slack channel name. - ```yaml - components: - terraform: - account-settings: - vars: - budgets_enabled: true - budgets_notifications_enabled: true - budgets_slack_webhook_url: https://url.slack.com/abcd/1234 - budgets_slack_username: AWS Budgets - budgets_slack_channel: aws-budgets-notifications - ``` - - [ ] To enable budgets for the entire organization, update `account-settings` in the same account as the Organization root account, typically `core-root`. This budget will include the total spending of all accounts in the Organization. - ```yaml - # stacks/orgs/acme/core/root/global-region/baseline.yaml - import: - - catalog/account-settings + 1. [Create a Slack Webhook](https://api.slack.com/messaging/webhooks). Take note of the Webhook URL and the final name of the Slack channel. The Slack channel name is case-sensitive and needs to match the name of the channel exactly as it appears in the owning Slack server (not the name if changed as a shared channel). + 2. Update the `account-settings` component with the Slack Webhook URL and the Slack channel name. 
+ ```yaml + # stacks/catalog/account-settings.yaml + components: + terraform: + account-settings: + vars: + budgets_enabled: true + budgets_notifications_enabled: true + budgets_slack_webhook_url: https://url.slack.com/abcd/1234 + budgets_slack_username: AWS Budgets + budgets_slack_channel: aws-budgets-notifications + ``` + 3. **To enable budgets for the entire organization**, update `account-settings` in the same account as the Organization root account, typically `core-root`. This budget will include the total spending of all accounts in the Organization. + ```yaml + # stacks/orgs/acme/core/root/global-region/baseline.yaml + import: + - catalog/account-settings - components: - terraform: - account-settings: - vars: - # Budgets in `root` apply to the Organization as a whole - budgets: - - name: Total AWS Organization Cost per Month - budget_type: COST - limit_amount: 10000 - limit_unit: USD - time_unit: MONTHLY - notification: - - comparison_operator: GREATER_THAN - notification_type: FORECASTED - threshold_type: PERCENTAGE - threshold: 80 - subscribers: - - slack - - comparison_operator: GREATER_THAN - notification_type: FORECASTED - threshold_type: PERCENTAGE - threshold: 100 - subscribers: - - slack - - comparison_operator: GREATER_THAN - notification_type: ACTUAL - threshold_type: PERCENTAGE - threshold: 100 - subscribers: - - slack - ``` - - [ ] To enable budgets for individual accounts, update `account-settings` in the account you want to enable budgets for or as the default setting for all `account-settings` components to apply to every account. This budget will include the spending of the given account only. - ```yaml - # stacks/catalog/account-settings.yaml - components: - terraform: - account-settings: - vars: - ... 
- budgets: - - name: 1000-total-monthly - budget_type: COST - limit_amount: "1000" - limit_unit: USD - time_unit: MONTHLY - - name: s3-3GB-limit-monthly - budget_type: USAGE - limit_amount: "3" - limit_unit: GB - time_unit: MONTHLY - ``` - - - [ ] Finally, reapply `account-settings` in any changed account to apply the new settings - - + components: + terraform: + account-settings: + vars: + # Budgets in `root` apply to the Organization as a whole + budgets: + - name: Total AWS Organization Cost per Month + budget_type: COST + limit_amount: 10000 + limit_unit: USD + time_unit: MONTHLY + notification: + - comparison_operator: GREATER_THAN + notification_type: FORECASTED + threshold_type: PERCENTAGE + threshold: 80 + subscribers: + - slack + - comparison_operator: GREATER_THAN + notification_type: FORECASTED + threshold_type: PERCENTAGE + threshold: 100 + subscribers: + - slack + - comparison_operator: GREATER_THAN + notification_type: ACTUAL + threshold_type: PERCENTAGE + threshold: 100 + subscribers: + - slack + ``` + 4. **To enable budgets for individual accounts**, update `account-settings` in the account you want to enable budgets for or as the default setting for all `account-settings` components to apply to every account. This budget will include the spending of the given account only. + ```yaml + # stacks/catalog/account-settings.yaml + components: + terraform: + account-settings: + vars: + ... + budgets: + - name: 1000-total-monthly + budget_type: COST + limit_amount: "1000" + limit_unit: USD + time_unit: MONTHLY + - name: s3-3GB-limit-monthly + budget_type: USAGE + limit_amount: "3" + limit_unit: GB + time_unit: MONTHLY + ``` + 5. 
Finally, reapply `account-settings` in any changed account to apply the new settings + - From 23671a031f35dbebb26bb80db42947a041ce44a9 Mon Sep 17 00:00:00 2001 From: milldr Date: Wed, 4 Dec 2024 14:52:31 -0500 Subject: [PATCH 8/8] add budget step to table --- docs/layers/accounts/account-baseline.mdx | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/layers/accounts/account-baseline.mdx b/docs/layers/accounts/account-baseline.mdx index 965116130..4be7936a4 100644 --- a/docs/layers/accounts/account-baseline.mdx +++ b/docs/layers/accounts/account-baseline.mdx @@ -17,6 +17,7 @@ Now that all the accounts have been deployed, we need to finalize the setup of t | Steps | Actions | | -------------------------- | ----------------------------------- | | Deploy baseline components | `atmos workflow deploy -f baseline` | +| Deploy account budgets | Create Slack Webhook and `atmos workflow deploy -f accounts` |
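
The basic connectivity test added to the EKS FAQ in this series (`curl -fsSk --max-time 5 "https://CLUSTER_ENDPOINT/healthz"`) is easier to act on when its exit status is mapped to the FAQ step most likely to apply. The following is a minimal sketch of that mapping, not part of the patch itself. The function name `classify_curl_exit` and the wording of each hint are assumptions made for illustration; the exit codes are the ones listed in the EXIT CODES section of the curl man page.

```bash
#!/usr/bin/env bash
# Sketch: translate the exit status of the /healthz probe into a likely next step.
# `classify_curl_exit` is a hypothetical helper name; the hints are illustrative.
classify_curl_exit() {
  case "$1" in
    0)  echo "reachable: endpoint answered /healthz" ;;
    6)  echo "dns: endpoint hostname did not resolve" ;;
    7)  echo "connect: connection refused or no route" ;;
    22) echo "http: endpoint reachable but returned an HTTP error (>= 400)" ;;
    28) echo "timeout: check security groups, NACLs, and routes" ;;
    35) echo "tls: TLS handshake failed" ;;
    *)  echo "other: see the EXIT CODES section of the curl man page (code $1)" ;;
  esac
}

# Usage (CLUSTER_ENDPOINT is the same placeholder the FAQ uses):
#   curl -fsSk --max-time 5 "https://CLUSTER_ENDPOINT/healthz"
#   classify_curl_exit "$?"
```

Because the probe passes `-f`, an HTTP status of 400 or above surfaces as exit code 22 rather than 0. That case suggests the network path works and the remaining problem is more likely endpoint access settings or permissions than routing.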