
[Release-1.25] - Bump k3s for etcd s3 fixes #5073

Closed
brandond opened this issue Nov 21, 2023 · 3 comments

@brandond
Member

Backport fix for Bump k3s for etcd s3 fixes

@mdrahman-suse
Contributor

Validated on version v1.25.16-rc1+rke2r1

https://github.com/k3s-io/k3s/issues/8918

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

Ubuntu 22.04

Cluster Configuration:

1 server

Config.yaml:

write-kubeconfig-mode: 644
token: summerheat
node-name: server1
node-external-ip: <publicIP>
debug: true

Testing Steps

  1. Copy config.yaml
$ sudo mkdir -p /etc/rancher/rke2 && sudo cp config.yaml /etc/rancher/rke2
  2. Install RKE2
  3. Run rke2 etcd-snapshot save against S3 with invalid values for the S3 options
$ sudo rke2 etcd-snapshot save --s3 --s3-endpoint="invalid" --s3-bucket="invalid" --s3-folder="invalid" --s3-access-key="invalid" --s3-secret-key="invalid" --s3-region="invalid"
  4. Ensure the error is handled gracefully
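To cover each invalid case separately, the argument lists can be generated rather than typed by hand. The following is a minimal sketch, assuming the flag names from the command above; the `invalidCase` helper and the placeholder values are hypothetical, and the program only prints the commands without running them.

```go
package main

import (
	"fmt"
	"strings"
)

// s3Flags lists the S3 options used in the command above, in the same order.
var s3Flags = []string{
	"s3-endpoint", "s3-bucket", "s3-folder",
	"s3-access-key", "s3-secret-key", "s3-region",
}

// invalidCase (hypothetical helper) builds the arguments for a run in which
// only the flag named by bad is set to "invalid"; all others keep their
// value from the valid map.
func invalidCase(valid map[string]string, bad string) []string {
	args := []string{"etcd-snapshot", "save", "--s3"}
	for _, name := range s3Flags {
		value := valid[name]
		if name == bad {
			value = "invalid"
		}
		args = append(args, fmt.Sprintf("--%s=%s", name, value))
	}
	return args
}

func main() {
	// Placeholder values only; substitute real settings before use.
	valid := map[string]string{
		"s3-endpoint":   "<endpoint>",
		"s3-bucket":     "<bucket>",
		"s3-folder":     "<folder>",
		"s3-access-key": "<access-key>",
		"s3-secret-key": "<secret-key>",
		"s3-region":     "<region>",
	}
	for _, bad := range s3Flags {
		fmt.Println("sudo rke2", strings.Join(invalidCase(valid, bad), " "))
	}
}
```

Each printed line corresponds to one invalid-property case, which makes it easier to compare the WARN message per case.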

Replication Results:

  • rke2 version used for replication:
rke2 version v1.25.15+rke2r2 (390ad79ce05d2c1192ee5871d780c325629420c4)
go version go1.20.10 X:boringcrypto
  • Observed a similar panic in almost all of the invalid cases; only the WARN message varies with the specific invalid property
WARN[0000] Unable to initialize S3 client: Access Denied.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x289f559]

goroutine 1 [running]:
github.com/k3s-io/k3s/pkg/etcd.(*S3).snapshotRetention(0xc0006a79ca?, {0x3adbbe8?, 0xc0006f02d0?})
	/go/pkg/mod/github.com/k3s-io/[email protected]/pkg/etcd/s3.go:284 +0x59
github.com/k3s-io/k3s/pkg/etcd.(*ETCD).Snapshot(0xc0007f8140, {0x3adbbe8, 0xc0006f02d0})
	/go/pkg/mod/github.com/k3s-io/[email protected]/pkg/etcd/snapshot.go:375 +0x13ca
github.com/k3s-io/k3s/pkg/cli/etcdsnapshot.save(0xc0006cba20, 0x14?)
	/go/pkg/mod/github.com/k3s-io/[email protected]/pkg/cli/etcdsnapshot/etcd_snapshot.go:126 +0x92
github.com/k3s-io/k3s/pkg/cli/etcdsnapshot.Save(0x552c080?)
	/go/pkg/mod/github.com/k3s-io/[email protected]/pkg/cli/etcdsnapshot/etcd_snapshot.go:109 +0x45
github.com/k3s-io/k3s/pkg/cli/etcdsnapshot.Run(0xc0006cba20?)
	/go/pkg/mod/github.com/k3s-io/[email protected]/pkg/cli/etcdsnapshot/etcd_snapshot.go:101 +0x19
github.com/urfave/cli.HandleAction({0x2f86640?, 0x37620b8?}, 0x4?)
	/go/pkg/mod/github.com/urfave/[email protected]/app.go:524 +0x50
github.com/urfave/cli.Command.Run({{0x35bcc86, 0x4}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x361ffd0, 0x22}, {0x0, ...}, ...}, ...)
	/go/pkg/mod/github.com/urfave/[email protected]/command.go:175 +0x67b
github.com/urfave/cli.(*App).RunAsSubcommand(0xc000334a80, 0xc0006cb760)
	/go/pkg/mod/github.com/urfave/[email protected]/app.go:405 +0xe87
github.com/urfave/cli.Command.startApp({{0x35d500b, 0xd}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x361ffd0, 0x22}, {0x0, ...}, ...}, ...)
	/go/pkg/mod/github.com/urfave/[email protected]/command.go:380 +0xb7f
github.com/urfave/cli.Command.Run({{0x35d500b, 0xd}, {0x0, 0x0}, {0x0, 0x0, 0x0}, {0x361ffd0, 0x22}, {0x0, ...}, ...}, ...)
	/go/pkg/mod/github.com/urfave/[email protected]/command.go:103 +0x845
github.com/urfave/cli.(*App).Run(0xc0003348c0, {0xc000845900, 0xc, 0x14})
	/go/pkg/mod/github.com/urfave/[email protected]/app.go:277 +0xb87
main.main()
	/source/main.go:23 +0xa3e
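The trace shows `(*S3).snapshotRetention` being reached after the S3 client had already failed to initialize. The sketch below illustrates that general failure pattern in Go and one way to guard against it. All types and function names in it are hypothetical stand-ins for illustration, not the k3s implementation or its actual fix.

```go
package main

import (
	"errors"
	"fmt"
)

// s3Client is a hypothetical stand-in for an S3 helper; it is not the k3s type.
type s3Client struct {
	bucket string
}

// newS3Client mimics an initializer that returns a nil pointer with an error.
func newS3Client(bucket string, valid bool) (*s3Client, error) {
	if !valid {
		return nil, errors.New("access denied")
	}
	return &s3Client{bucket: bucket}, nil
}

// retention reads a field of the receiver, so calling it on a nil
// *s3Client would panic with a nil pointer dereference.
func (c *s3Client) retention() string {
	return "pruning snapshots in " + c.bucket
}

// snapshot shows the guarded flow: when initialization fails, the error is
// returned and no further method is called on the nil client.
func snapshot(bucket string, valid bool) (string, error) {
	client, err := newS3Client(bucket, valid)
	if err != nil {
		return "", fmt.Errorf("unable to initialize S3 client: %w", err)
	}
	return client.retention(), nil
}

func main() {
	if _, err := snapshot("invalid", false); err != nil {
		fmt.Println("WARN:", err)
	}
	msg, _ := snapshot("my-bucket", true)
	fmt.Println(msg)
}
```

Without the `err != nil` check in `snapshot`, the call to `client.retention()` would dereference a nil pointer, which is the class of crash reported above.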

Validation Results:

  • rke2 version used for validation:
rke2 version v1.25.16-rc1+rke2r1 (760de1e9851f5af9782a9437b2e9cc49c1f743f9)
go version go1.20.11 X:boringcrypto
  • No panic observed for invalid S3 properties (access key / bucket name)
INFO[0000] Checking if S3 bucket <bucket> exists
WARN[0000] Unable to initialize S3 client: Access Denied.
INFO[0000] Reconciling ETCDSnapshotFile resources
INFO[0000] Checking if S3 bucket <bucket> exists
WARN[0000] Unable to initialize S3 client: Access Denied.
INFO[0000] Reconciliation of ETCDSnapshotFile resources complete
FATA[0000] Access Denied.
$ kubectl get etcdsnapshotfile | grep s3-on-demand-server
s3-on-demand-server1-1701215526-41242b      on-demand-server1-1701215526   s3                                                                                        0         2023-11-28T23:52:06Z

$ kubectl get etcdsnapshotfile s3-on-demand-server1-1701215526-41242b -o yaml
apiVersion: k3s.cattle.io/v1
kind: ETCDSnapshotFile
metadata:
  creationTimestamp: "2023-11-28T23:52:06Z"
  finalizers:
  - wrangler.cattle.io/managed-etcd-snapshots-controller
  generation: 1
  labels:
    etcd.rke2.cattle.io/snapshot-storage-node: s3
  name: s3-on-demand-server1-1701215526-41242b
  resourceVersion: "3949"
  uid: b41b573c-218a-4710-9b58-833126b05a94
spec:
  location: ""
  nodeName: s3
  s3:
    bucket: <bucket>
    endpoint: s3.amazonaws.com
    prefix: rke2
    region: us-east-2
  snapshotName: on-demand-server1-1701215526
status:
  creationTime: "2023-11-28T23:52:06Z"
  error:
    message: Access Denied.
    time: "2023-11-28T23:52:06Z"
  readyToUse: false
  size: "0"
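The resource above records the failure in `status.error` and leaves `readyToUse` false. A minimal sketch of checking those fields programmatically follows, assuming the same object is fetched with `-o json` and that the JSON field names match the YAML shown; the `describe` function and the struct are hypothetical and cover only the fields visible above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// snapshotFile mirrors only the fields visible in the YAML above.
type snapshotFile struct {
	Spec struct {
		SnapshotName string `json:"snapshotName"`
	} `json:"spec"`
	Status struct {
		ReadyToUse bool `json:"readyToUse"`
		Error      *struct {
			Message string `json:"message"`
		} `json:"error"`
	} `json:"status"`
}

// describe (hypothetical helper) summarizes whether a snapshot is usable.
func describe(raw []byte) (string, error) {
	var f snapshotFile
	if err := json.Unmarshal(raw, &f); err != nil {
		return "", err
	}
	if f.Status.Error != nil {
		return fmt.Sprintf("%s failed: %s", f.Spec.SnapshotName, f.Status.Error.Message), nil
	}
	if !f.Status.ReadyToUse {
		return f.Spec.SnapshotName + " not ready", nil
	}
	return f.Spec.SnapshotName + " ready", nil
}

func main() {
	// Trimmed JSON equivalent of the resource shown above.
	raw := []byte(`{"spec":{"snapshotName":"on-demand-server1-1701215526"},` +
		`"status":{"readyToUse":false,"error":{"message":"Access Denied."}}}`)
	out, err := describe(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(out)
}
```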

@VestigeJ
Contributor

VestigeJ commented Nov 29, 2023

Environment Details
Validated using VERSION=v1.25.16+rke2r1

Infrastructure

  • Cloud

Node(s) CPU architecture, OS, and version:

Linux 5.14.21-150500.53-default x86_64 GNU/Linux
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"

Cluster Configuration:

NAME             STATUS   ROLES                       AGE     VERSION
ip-12-13-14-23   Ready    control-plane,etcd,master   6m33s   v1.25.16+rke2r1
ip-12-13-14-15   Ready    <none>                      8m30s   v1.25.16+rke2r1
ip-12-13-14-36   Ready    control-plane,etcd,master   16m     v1.25.16+rke2r1
ip-12-13-14-76   Ready    control-plane,etcd,master   7m45s   v1.25.16+rke2r1

Config.yaml:

server: https://12.14.13.15:9345
write-kubeconfig-mode: 644
debug: true
token: YOUR_TOKEN_HERE
profile: cis
selinux: true
node-external-ip: 137.0.0.1
etcd-s3: true
etcd-s3-bucket: "k3s-etcd-testing"
etcd-s3-endpoint: "shmugatu.r2.cloudflarestorage.com"
etcd-s3-access-key: "shmugatuutagumhs"
etcd-s3-secret-key: "shmugatuutagumhsshmugatuutagumhs"

Validation Steps R2

$ curl https://get.rke2.io --output install-"rke2".sh
$ sudo chmod +x install-"rke2".sh
$ sudo groupadd --system etcd && sudo useradd -s /sbin/nologin --system -g etcd etcd
$ sudo modprobe ip_vs_rr
$ sudo modprobe ip_vs_wrr
$ sudo modprobe ip_vs_sh
$ sudo printf "vm.panic_on_oom=0 \nvm.overcommit_memory=1 \nkernel.panic=10 \nkernel.panic_ps=1 \nkernel.panic_on_oops=1 \n" > ~/60-rke2-cis.conf
$ sudo cp 60-rke2-cis.conf /etc/sysctl.d/
$ sudo systemctl restart systemd-sysctl
$ sudo INSTALL_RKE2_VERSION=v1.25.16-rc1+rke2r1 INSTALL_RKE2_EXEC=server ./install-rke2.sh
$ go_rke2
$ sudo /usr/local/bin/rke2 etcd-snapshot save

Results:

$ sudo /usr/local/bin/rke2 etcd-snapshot save

WARN[0000] Unknown flag --server found in config.yaml, skipping
WARN[0000] Unknown flag --write-kubeconfig-mode found in config.yaml, skipping
WARN[0000] Unknown flag --token found in config.yaml, skipping
WARN[0000] Unknown flag --profile found in config.yaml, skipping
WARN[0000] Unknown flag --selinux found in config.yaml, skipping
WARN[0000] Unknown flag --node-external-ip found in config.yaml, skipping
DEBU[0000] Attempting to retrieve extra metadata from rke2-etcd-snapshot-extra-metadata ConfigMap
DEBU[0000] Error encountered attempting to retrieve extra metadata from rke2-etcd-snapshot-extra-metadata ConfigMap, error: configmaps "rke2-etcd-snapshot-extra-metadata" not found
INFO[0000] Saving etcd snapshot to /var/lib/rancher/rke2/server/db/snapshots/on-demand-ip-1-1-2-3-1701294147
{"level":"info","ts":"2023-11-29T21:42:26.681Z","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/var/lib/rancher/rke2/server/db/snapshots/on-demand-ip-1-1-2-3-1701294147.part"}
{"level":"info","ts":"2023-11-29T21:42:26.685Z","logger":"client","caller":"[email protected]/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2023-11-29T21:42:26.685Z","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://127.0.0.1:2379"}
{"level":"info","ts":"2023-11-29T21:42:26.775Z","logger":"client","caller":"[email protected]/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2023-11-29T21:42:26.804Z","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://127.0.0.1:2379","size":"11 MB","took":"now"}
{"level":"info","ts":"2023-11-29T21:42:26.804Z","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/var/lib/rancher/rke2/server/db/snapshots/on-demand-ip-1-1-2-3-1701294147"}
INFO[0000] Checking if S3 bucket k3s-etcd-testing exists
INFO[0000] S3 bucket k3s-etcd-testing exists
INFO[0000] Saving etcd snapshot on-demand-ip-1-1-2-3-1701294147 to S3
INFO[0000] Uploading snapshot to s3://k3s-etcd-testing//var/lib/rancher/rke2/server/db/snapshots/on-demand-ip-1-1-2-3-1701294147
INFO[0001] Uploaded snapshot metadata s3://k3s-etcd-testing/.metadata/on-demand-ip-1-1-2-3-1701294147
INFO[0001] S3 upload complete for on-demand-ip-1-1-2-3-1701294147
INFO[0001] Reconciling ETCDSnapshotFile resources
DEBU[0001] Found snapshotFile for on-demand-ip-1-1-2-3-1701294135 with key local-on-demand-ip-1-1-2-3-1701294135
DEBU[0001] Found snapshotFile for on-demand-ip-1-1-2-3-1701294147 with key local-on-demand-ip-1-1-2-3-1701294147
DEBU[0001] Found snapshotFile for on-demand-ip-1-1-2-3-1701294147 with key s3-on-demand-ip-1-1-2-3-1701294147
DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294135 with key local-on-demand-ip-1-1-2-3-1701294135
DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294147 with key local-on-demand-ip-1-1-2-3-1701294147
DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294135 with key s3-on-demand-ip-1-1-2-3-1701294135
DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294147 with key s3-on-demand-ip-1-1-2-3-1701294147
INFO[0001] Reconciliation of ETCDSnapshotFile resources complete
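When validating, it can help to confirm from the debug output that each snapshot has both a local and an S3 entry. The sketch below groups the `Found ETCDSnapshotFile` lines by snapshot name. It assumes the line format and the `<store>-<snapshot name>` key layout seen in this log; the `storesBySnapshot` function is hypothetical, and keys with an extra suffix would need different handling.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// foundLine matches the DEBU lines shown in the log above.
var foundLine = regexp.MustCompile(`Found ETCDSnapshotFile for (\S+) with key (\S+)`)

// storesBySnapshot (hypothetical helper) maps each snapshot name to the
// storage prefixes ("local", "s3") found in its keys, in log order.
func storesBySnapshot(log string) map[string][]string {
	out := map[string][]string{}
	for _, line := range strings.Split(log, "\n") {
		m := foundLine.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		name, key := m[1], m[2]
		// Assumes the key is "<store>-<name>", as in the log above.
		store := strings.TrimSuffix(key, "-"+name)
		out[name] = append(out[name], store)
	}
	return out
}

func main() {
	log := strings.Join([]string{
		"DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294147 with key local-on-demand-ip-1-1-2-3-1701294147",
		"DEBU[0001] Found ETCDSnapshotFile for on-demand-ip-1-1-2-3-1701294147 with key s3-on-demand-ip-1-1-2-3-1701294147",
	}, "\n")
	stores := storesBySnapshot(log)
	fmt.Println(stores["on-demand-ip-1-1-2-3-1701294147"])
}
```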

@mdrahman-suse
Contributor

Closing this. A minor follow-up fix will be added in the next release; it needs to be validated in rke2 and is tracked in k3s: k3s-io/k3s#8925
