feat: add wiz manifests #8679
base: dev
Conversation
👍

We should also populate the `deletions.yaml`. Otherwise, the ConfigItem cannot be used to turn it off again.
```yaml
allowPrivilegeEscalation: false
runAsNonRoot: true
runAsUser: 1000
image: "wiziopublic.azurecr.io/wiz-app/wiz-broker:2.5"
```
In our environment all images need to come from container-registry.zalando.net.
You probably need to create an internal repository that "republishes" this public image to our internal ECR registry and then use that link, as Martin suggests here. I communicated this in our internal chat thread.
```yaml
"helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
rollme.wizApiTokenHash: ce8124bc1b0fbc0cb5cd47338ca0c7d5f5446d79936e443a201d96b192a7bd65
rollme.proxyHash: 9aa53d69075371b3fa23ebeea2fd2416ea81fb533499d071ca2d576f17c7c886
rollme.brokerHash: 115ba85431eeaf8db3ff2173aee02d16e67df1555d5e1ef74cfa7ac0d812cab2
```
These values look like they are computed based on the configuration value in order to trigger an update when they change. Having these values static here doesn't provide any value.
We do have similar functionality which can be used if it's really needed.
I commented the same on the `rollme` annotation on the connector deployment manifest above. It appears this is used to keep track of changes in the deployment manifests, not in the configuration, so we probably don't need this whole setup anyway. Maybe it's a Helm thing? CLM does this job for us.
```yaml
type: Opaque
stringData:
  clientId: "{{ .Cluster.ConfigItems.wiz_api_client_id | base64 }}"
  clientToken: "{{ .Cluster.ConfigItems.wiz_api_client_token | base64 }}"
```
When using `stringData`, the `base64` encoding might not be needed; just FYI in case you see issues. https://kubernetes.io/docs/concepts/configuration/secret/#restriction-names-data
@vinaythupili now that I think about it, the values are already base64 encoded before the encryption is done. I think we meant to do `base64Decode` here originally, but since the secret can be base64 anyway, we probably don't need this encoding and can just use the `data` field instead of `stringData`.
updated to `data` 👍
Do we still need the `base64` template function, now that we're using the `data` field that's already base64-encoded when the secret object is stored? (That's what I understand from Zak's comment above)

updated 👍
Signed-off-by: Katyanna Moura <[email protected]>
```yaml
- name: wiz-cluster-reader
  kind : ClusterRoleBinding
  namespace: wiz
{{- end }}
```
nit: Missing empty line at the end 😅
Trigger deletions when daemonset is disabled. Co-authored-by: Zak Lawrence A <[email protected]>
```yaml
# apiVersion: rbac.authorization.k8s.io/v1
# kind: ClusterRole
# metadata:
#   name: wiz-cluster-reader
#   labels:
#     helm.sh/chart: wiz-kubernetes-connector-3.1.1
#     app.kubernetes.io/name: wiz-kubernetes-connector
#     app.kubernetes.io/instance: wiz-connector
#     app.kubernetes.io/version: "2.5"
#     app.kubernetes.io/managed-by: Helm
# rules:
#   - apiGroups: ["*"]
#     resources: ["*"]
#     verbs: ["get", "list", "watch"]
```
Do we want to keep this manifest here? I think the comment is sufficient to communicate that we don't need a global reader role 🤔
```yaml
labels:
  helm.sh/chart: wiz-kubernetes-connector-3.1.1
  app.kubernetes.io/name: wiz-kubernetes-connector
  app.kubernetes.io/instance: wiz-connector
  app.kubernetes.io/version: "2.5"
  app.kubernetes.io/managed-by: Helm
```
Are these labels important for the Wiz deployment? I think we might need to remove the ones related to `helm`; we also need to provide the Zalando environment labels like `application: foo` and `component: bar`.
I see a similar concern was raised by Martin in another manifest, so let's fix the labels for all manifests 🙂

- Drop the helm related labels.
- Add `application` and `component` labels.
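For illustration, a hedged sketch of what the cleaned-up label set could look like, with the Helm labels dropped and the Zalando environment labels added (the `application`/`component` values are assumptions to be confirmed):

```yaml
metadata:
  labels:
    application: wiz        # assumed value, per Zalando labeling convention
    component: connector    # assumed value
    app.kubernetes.io/name: wiz-kubernetes-connector
    app.kubernetes.io/instance: wiz-connector
```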
```yaml
runAsNonRoot: true
runAsUser: 1000
image: "wiziopublic.azurecr.io/wiz-app/wiz-broker:2.5"
imagePullPolicy: Always
```
`imagePullPolicy` should be `IfNotPresent`. Images pulled are always cached by the runtime, and reusing the cached image also results in faster pod startup times. This is only set to `Always` in cases where you expect frequent image updates, which is not the case for our infrastructure.
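A sketch of the suggested change (the internal image path is an assumption following the registry requirement raised earlier; the exact repository path still needs to be set up):

```yaml
image: "container-registry.zalando.net/wiz-app/wiz-broker:2.5"  # assumed internal path, to be confirmed
imagePullPolicy: IfNotPresent
```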
```yaml
- name: WIZ_ENV
  value:
```
Do we want to create an empty environment variable?
```yaml
args: [
  /etc/connectorData/data
]
```
It's a lot cleaner to specify a list with `-` instead of square brackets, like:

```yaml
args:
  - /etc/connectorData/data
```

We use the same format throughout the repository, e.g. see: https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/manifests/role-sync-controller/cronjob.yaml#L31
```yaml
annotations:
  rollme: "Cd4Gg"
```
What's the purpose of this annotation? Is it to control rolling updates? We already have the CLM perform updates upon manifest changes made in the repository.
```yaml
app.kubernetes.io/name: wiz-kubernetes-connector
app.kubernetes.io/instance: wiz-connector
```
I think we probably need to change these labels to `application: wiz` and `component: connector` everywhere. This is how we manage applications and components in our infrastructure.
```yaml
image: "wiziopublic.azurecr.io/wiz-app/wiz-broker:2.5"
imagePullPolicy: Always
```
Same comments here to update the image URL and `imagePullPolicy`.
```yaml
command:
  - "wiz-broker"
args:

```
nit: remove empty line
```yaml
template:
  metadata:
    labels:

```
nit: remove empty line.
```yaml
image: "wiziopublic.azurecr.io/wiz-app/wiz-broker:2.5"
imagePullPolicy: Always
```
Same suggestion here to update this.
```yaml
- name: WIZ_ENV
  value: ""
```
Do we need an empty env var?
```yaml
resourceNames: [
  "wiz-api-token",
  "wiz-cluster-reader-token",
]
```
As suggested above, let's use the `-` format when listing values on multiple lines:

```yaml
resourceNames:
  - wiz-api-token
  - wiz-cluster-reader-token
```
```yaml
- >
  wiz-broker delete-kubernetes-connector
  --input-secrets-namespace
  "default"
```
I think this namespace should be `wiz` and not `default`. I'm looking at the `secrets.yaml` file below and it's deploying the secret in the `wiz` namespace.
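The corrected invocation could look like this (a sketch based on the snippet above, assuming the secret really does live in `wiz`):

```yaml
- >
  wiz-broker delete-kubernetes-connector
  --input-secrets-namespace
  "wiz"
```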
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: wiz
```
I believe you need to rename the file to `001-namespace.yaml` to ensure that CDP deploys it before any other manifest; otherwise, you will run into issues. Similarly, you want to ensure that the RBAC resources are deployed before any deployment/job objects are deployed.

You can see how we do it for `karpenter` for inspiration on how to manage precedence of manifest application: https://github.com/zalando-incubator/kubernetes-on-aws/tree/dev/cluster/manifests/z-karpenter
```yaml
metadata:
  name: wiz-sensor
  labels:

```
nit: empty line should be removed
```yaml
- matchExpressions:
    - key: dedicated
      operator: NotIn
      values:
        - skipper-ingress
```
We also have `skipper` and `skipper-ingress-redis` labelled nodes that we will have to skip if we're skipping `skipper-ingress`.

We also have other "system" dedicated nodes like `zmon-worker-arm64`, `zmon-redis` and `zmon-appliance` labelled nodes for ZMON.
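For illustration, a sketch of the `matchExpressions` extended with the other dedicated node labels mentioned in this thread (the exact label values should be verified against the cluster configuration before applying):

```yaml
- matchExpressions:
    - key: dedicated
      operator: NotIn
      values:
        - skipper-ingress
        - skipper
        - skipper-ingress-redis
        - zmon-worker-arm64
        - zmon-redis
        - zmon-appliance
```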
I'm assuming we are scoping the Wiz system only to "application worker nodes", please correct me if this isn't necessarily the case.
```yaml
image: wizio.azurecr.io/sensor:v1
imagePullPolicy: Always
```
todo: this needs updating, as suggested above
```yaml
- key: node.kubernetes.io/not-ready
  operator: Exists
```
I see that for the `zalando.org/node-not-ready` taint below we added `effect: NoSchedule`. We probably need the same behavior for the `node.kubernetes.io/not-ready` taint as well?
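A sketch of the toleration with the effect added (assuming the same `NoSchedule` behavior as the `zalando.org/node-not-ready` toleration mentioned above):

```yaml
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoSchedule
```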
```yaml
- name: HELM_CHART_COMPAT_VER
  value: "1"
```
Probably need to drop this ENVVAR.
```yaml
- name: LOG_FILE
  value: "/wiz-sensor-store/sensor.log"
```
I see below that the `tmp-store` `emptyDir` type volume backing the `/wiz-sensor-store` directory is only `100Mi` in size. This might not be suitable for logs, as I see the log level is set to `info`. We probably need a bigger volume here.

I also see you specified the `"Memory"` backend. From the Kubernetes docs:

> While tmpfs is very fast, be aware that, unlike disks, files you write count against the memory limit of the container that wrote them

I suggest you just drop the `"Memory"` backend and use a disk-backed `emptyDir` volume that's not "too large", to rely on the root FS while not exhausting the root disk of the node where it runs.

We might want to make a note of monitoring the log volume Wiz generates once we roll it out.
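For illustration, a sketch of a disk-backed `emptyDir` with a size limit (the `1Gi` value is an assumption to be tuned after observing real log volume):

```yaml
volumes:
  - name: tmp-store
    emptyDir:
      # No "medium: Memory" here, so the volume is backed by node disk
      # instead of tmpfs counting against the container's memory limit.
      sizeLimit: 1Gi  # assumed value, adjust once log volume is known
```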
Also, I see the `logging-destination` annotation is missing on the pod templates in the application manifests, please add that. Example: https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/cluster/manifests/z-karpenter/deployment.yaml#L28

This will ensure that the Wiz logs are sent to Scalyr as well. This is a built-in feature of the infrastructure, and the information required to set up connectivity with the Scalyr account is already injected into the pod when it's admitted, so you just need to provide this annotation and logs should show up in Scalyr.

That should make it safe to have a somewhat smaller volume for logging locally to disk, since logs will already be pushed to Scalyr.
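A sketch of where the annotation goes on the pod template (the annotation value is a placeholder, not a real destination; copy the exact key and value from the linked karpenter example):

```yaml
template:
  metadata:
    annotations:
      logging-destination: "<destination>"  # placeholder; take the real value from the karpenter deployment
```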
```yaml
- name: CRI_SOCKET_CUSTOM_PATH
  value:
- name: HTTP_PROXY_URL
  value:
- name: HTTP_PROXY_USERNAME
  value:
- name: HTTP_PROXY_PASSWORD
  value:
- name: HTTP_PROXY_CERT
  value:
```
Maybe we can just drop these empty ENVVARs?
```yaml
clientId: "{{ .Cluster.ConfigItems.wiz_api_client_id | base64 }}"
clientToken: "{{ .Cluster.ConfigItems.wiz_api_client_token | base64 }}"
```
I suppose the same applies here: do we need to drop the `base64` template function?
```yaml
# Source: wiz-sensor/templates/imagepullsecret.yaml
apiVersion: v1
kind: Secret
type: kubernetes.io/dockerconfigjson
metadata:
  name: wiz-sensor-imagepullkey
  labels:
    helm.sh/chart: wiz-sensor-1.0.4760
    image/tag: v1
    app.kubernetes.io/name: wiz-sensor
    app.kubernetes.io/instance: wiz-sensor
    app.kubernetes.io/version: "1.0.4708"
    app.kubernetes.io/managed-by: Helm
  namespace: wiz
data:
  .dockerconfigjson: "{{ .Cluster.ConfigItems.wiz_sensor_dockerconfigjson }}"
```
This looks strange 😅 I suggest we do the following:

1. Instead of a Secret, make this manifest a `ConfigMap` that keeps the docker config JSON.
2. We expose only the token itself as a config-item, instead of `wiz_sensor_dockerconfigjson` (or username and token can both be config-items).
3. We populate the token in the Docker config JSON using that config-item in the ConfigMap instead.
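For illustration, a sketch of exposing only the token and templating the docker config JSON around it (the config-item name `wiz_sensor_registry_token` and the registry host are assumptions; note that `imagePullSecrets` can only reference a Secret of type `kubernetes.io/dockerconfigjson`, so the templated JSON still needs to end up in a Secret rather than a ConfigMap):

```yaml
apiVersion: v1
kind: Secret
type: kubernetes.io/dockerconfigjson
metadata:
  name: wiz-sensor-imagepullkey
  namespace: wiz
stringData:
  # The surrounding JSON lives in the manifest; only the credential is a config-item.
  .dockerconfigjson: |
    {
      "auths": {
        "wizio.azurecr.io": {
          "auth": "{{ .Cluster.ConfigItems.wiz_sensor_registry_token }}"
        }
      }
    }
```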