Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(flux): use the operator for bootstrapping and managing flux's components #632

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Smana
Copy link
Owner

@Smana Smana commented Dec 22, 2024

PR Type

Enhancement, Documentation


Description

  • Integrated the Flux operator for managing Flux components, replacing the previous flux_bootstrap_git resource.
  • Added new HelmRelease, ConfigMap, and Kubernetes resources for the Flux operator and its configuration.
  • Introduced monitoring resources (VMPodScrape and VMServiceScrape) for Flux components and the operator.
  • Updated Terraform variables and documentation to reflect the new Flux operator setup.
  • Removed deprecated Flux and GitHub providers and associated configurations.

Changes walkthrough 📝

Relevant files
Enhancement
13 files
data.tf
Introduced data source for Flux manifests.                             

opentofu/eks/data.tf

  • Added a new kubectl_filename_list data source for Flux manifests.
  • +4/-0     
    flux.tf
    Removed Flux bootstrap and variable ConfigMap resources. 

    opentofu/eks/flux.tf

  • Removed the flux_bootstrap_git resource.
  • Removed the kubernetes_config_map resource for Flux variables.
  • +0/-30   
    helm.tf
    Added Helm release for Flux operator.                                       

    opentofu/eks/helm.tf

    • Added a new Helm release for the Flux operator.
    +16/-0   
    kubernetes.tf
    Added Flux and Karpenter manifests and secret resources. 

    opentofu/eks/kubernetes.tf

  • Added kubectl_manifest resources for Flux and Karpenter manifests.
  • Added a Kubernetes secret for Flux system credentials.
  • +66/-0   
    variables.tf
    Updated variables for Flux operator and SSM.                         

    opentofu/eks/variables.tf

  • Renamed ssm_enabled to enable_ssm.
  • Added new variables for Flux operator configuration.
  • +18/-11 
    controllers-vmpodscrape.yaml
    Introduced monitoring for Flux controllers.                           

    flux/observability/controllers-vmpodscrape.yaml

  • Added a new VMPodScrape resource for monitoring Flux controllers.
  • +23/-1   
    operator-vmservicescrape.yaml
    Added monitoring for Flux operator.                                           

    flux/observability/operator-vmservicescrape.yaml

  • Added a VMServiceScrape resource for monitoring the Flux operator.
  • +19/-0   
    helmrelease.yaml
    Added HelmRelease for Flux operator.                                         

    flux/operator/helmrelease.yaml

    • Added a HelmRelease resource for the Flux operator.
    +11/-0   
    ocirepo-flux-operator.yaml
    Added OCIRepository for Flux operator.                                     

    flux/sources/ocirepo-flux-operator.yaml

    • Added an OCIRepository resource for the Flux operator.
    +10/-0   
    cluster-vars-configmap.yaml
    Added ConfigMap for Flux cluster variables.                           

    opentofu/eks/kubernetes-manifests/flux/cluster-vars-configmap.yaml

    • Added a ConfigMap for storing Flux cluster variables.
    +16/-0   
    instance.yaml
    Added FluxInstance resource for Flux management.                 

    opentofu/eks/kubernetes-manifests/flux/instance.yaml

    • Added a FluxInstance resource for managing Flux components.
    +31/-0   
    providers.tf
    Removed deprecated Flux and GitHub providers.                       

    opentofu/eks/providers.tf

    • Removed the flux and github providers.
    +0/-22   
    eks.go
    Updated Terraform variable handling for Flux.                       

    dagger/eks.go

    • Updated Terraform variable references for Flux Git branch/tag.
    +2/-2     
    Documentation
    1 files
    README.md
    Updated README for Flux operator integration.                       

    opentofu/eks/README.md

  • Updated documentation to reflect changes in Flux configuration.
  • Removed references to deprecated GitHub variables.
  • +12/-10 

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    Sensitive information exposure:
    The kubernetes_secret resource for Flux includes a GitHub token retrieved from AWS Secrets Manager. Ensure that the secret is securely managed and access is restricted to authorized entities only.

    ⚡ Recommended focus areas for review

    Possible Misconfiguration

    The kubectl_manifest resource for Flux includes a dependency on helm_release.flux-operator. Ensure that the Helm release is correctly configured and deployed before applying the Flux manifests to avoid runtime errors.

    resource "kubectl_manifest" "flux" {
      for_each = { for file_name in data.kubectl_filename_list.flux.matches : file_name => file_name }
      yaml_body = templatefile(each.key,
        {
          flux_operator_version               = var.flux_operator_version
          enable_flux_image_update_automation = var.enable_flux_image_update_automation
          repository_sync_url                 = var.flux_sync_repository_url
          git_ref                             = var.flux_git_ref
          cluster_name                        = var.cluster_name
          oidc_provider_arn                   = module.eks.oidc_provider_arn
          oidc_issuer_url                     = module.eks.cluster_oidc_issuer_url
          oidc_issuer_host                    = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
          aws_account_id                      = data.aws_caller_identity.this.account_id
          region                              = var.region
          environment                         = var.env
          vpc_id                              = data.aws_vpc.selected.id
          vpc_cidr_block                      = data.aws_vpc.selected.cidr_block
        }
      )
    
      depends_on = [helm_release.flux-operator]
    }
    Monitoring Configuration

    The VMPodScrape resource targets multiple controllers. Verify that the matchExpressions and podMetricsEndpoints are correctly configured to capture the desired metrics without missing or redundant data.

    apiVersion: operator.victoriametrics.com/v1beta1
    kind: VMPodScrape
    metadata:
      name: flux-system
      namespace: flux-system
    spec:
      namespaceSelector:
        matchNames:
          - flux-system
      selector:
        matchExpressions:
          - key: app
            operator: In
            values:
              - helm-controller
              - source-controller
              - kustomize-controller
              - notification-controller
              - image-automation-controller
              - image-reflector-controller
      podMetricsEndpoints:
        - targetPort: http-prom
    Conditional Logic in Manifests

    The FluxInstance manifest uses conditional logic for enabling image update automation. Ensure that the templating syntax and conditions are correctly evaluated during deployment.

    %{ if enable_flux_image_update_automation ~}
        - image-reflector-controller
        - image-automation-controller
    %{ endif ~}

    Copy link
    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Add validation for the branch variable to ensure it is not empty before usage

    Add error handling to validate the branch variable before using it in the tfRun
    function to prevent potential issues with invalid branch names.

    dagger/eks.go [13]

    +if branch == "" {
    +    return nil, fmt.Errorf("branch name cannot be empty")
    +}
     output, err := tfRun(ctx, ctr, workDir, tfarg, []string{"-var-file", "variables.tfvars", "-var", fmt.Sprintf("flux_git_ref=refs/heads/%s", branch)})
    Suggestion importance[1-10]: 9

    Why: Adding validation for the branch variable is a significant improvement as it prevents potential runtime errors caused by an empty branch name. The suggestion is precise and directly addresses a potential issue in the PR.

    9
    Add validation to handle missing or invalid variables in the templatefile function to prevent runtime errors

    Ensure that the templatefile function in the kubectl_manifest resources properly
    handles missing or invalid variables to avoid runtime errors.

    opentofu/eks/kubernetes.tf [38-55]

     yaml_body = templatefile(each.key,
       {
         flux_operator_version               = var.flux_operator_version
         enable_flux_image_update_automation = var.enable_flux_image_update_automation
         repository_sync_url                 = var.flux_sync_repository_url
         git_ref                             = var.flux_git_ref
         cluster_name                        = var.cluster_name
         oidc_provider_arn                   = module.eks.oidc_provider_arn
         oidc_issuer_url                     = module.eks.cluster_oidc_issuer_url
         oidc_issuer_host                    = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
         aws_account_id                      = data.aws_caller_identity.this.account_id
         region                              = var.region
         environment                         = var.env
         vpc_id                              = data.aws_vpc.selected.id
         vpc_cidr_block                      = data.aws_vpc.selected.cidr_block
       }
    -)
    +) if all([var.flux_operator_version, var.flux_sync_repository_url, var.flux_git_ref, var.cluster_name]) else null
    Suggestion importance[1-10]: 8

    Why: Adding validation for missing or invalid variables in the templatefile function is crucial to prevent runtime errors, especially in infrastructure code where such errors can lead to deployment failures. The suggestion is accurate and directly addresses a potential issue in the PR.

    8
    General
    Validate that the pod labels match the matchExpressions selector to ensure accurate metric scraping

    Verify that the matchExpressions selector keys and values align with the actual
    labels applied to the pods to ensure proper scraping.

    flux/observability/controllers-vmpodscrape.yaml [10-21]

     selector:
       matchExpressions:
         - key: app
           operator: In
           values:
             - helm-controller
             - source-controller
             - kustomize-controller
             - notification-controller
             - image-automation-controller
             - image-reflector-controller
    +# Ensure these labels are correctly applied to the relevant pods
    Suggestion importance[1-10]: 6

    Why: Ensuring that the matchExpressions selector aligns with actual pod labels is important for accurate metric scraping. However, the suggestion is more of a verification step rather than a direct code improvement, which slightly reduces its impact.

    6
    Validate the conditional inclusion of components to avoid configuration errors

    Ensure that the conditional logic for including components like
    image-reflector-controller and image-automation-controller is correctly evaluated to
    avoid misconfigurations.

    opentofu/eks/kubernetes-manifests/flux/instance.yaml [16-19]

     %{
       if enable_flux_image_update_automation ~}
         - image-reflector-controller
         - image-automation-controller
     %{ endif ~}
    +# Confirm that `enable_flux_image_update_automation` is properly passed and evaluated
    Suggestion importance[1-10]: 5

    Why: While the suggestion to validate the conditional logic is valid, it does not propose a specific code change or improvement. Instead, it asks for confirmation, which reduces its actionable value.

    5

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants