Update ha.md to add a note in section "1. Configure the Fixed Registr… #231

Closed
pdiaz-suse wants to merge 1 commit

pdiaz-suse
Contributor

…ation Address"

A customer (Pensionsmyndigheten) suggested an update to our documentation to prevent race conditions in HA setups that use an F5 load balancer. They encounter random join failures for additional rke2 master nodes, which the customer described as follows: "The problem is that rke2 immediately starts to listen on port 9345 when the rke2-server service is started (even though it has not completed (or even started) the installation process to be a functioning member of the cluster). This fools the load balancer into thinking that the new node is ready for requests on port 9345 while only the first node can actually respond to requests for joining the cluster. We solved this by adding additional port monitoring on port 6443 to determine if the nodes are ready in the load balancer."

@pdiaz-suse pdiaz-suse requested a review from a team as a code owner July 1, 2024 13:30
@@ -39,7 +39,7 @@ This endpoint can be set up using any number approaches, such as:

This endpoint can also be used for accessing the Kubernetes API. So you can, for example, modify your [kubeconfig](https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/) file to point to it instead of a specific node.

-Note that the `rke2 server` process listens on port `9345` for new nodes to register. The Kubernetes API is served on port `6443`, as normal. Configure your load balancer accordingly.
+Note that the `rke2 server` process listens on port `9345` for new nodes to register. The Kubernetes API is served on port `6443`, as normal. Configure your load balancer accordingly. Note that when adding new nodes to the cluster, additional checks may be required in your load balancer to determine if the server is ready. For example, an additional monitoring check in the load balancer for port 6443 can help determine if the node can join the cluster.
Member

Suggested change
-Note that the `rke2 server` process listens on port `9345` for new nodes to register. The Kubernetes API is served on port `6443`, as normal. Configure your load balancer accordingly. Note that when adding new nodes to the cluster, additional checks may be required in your load balancer to determine if the server is ready. For example, an additional monitoring check in the load balancer for port 6443 can help determine if the node can join the cluster.
+Note that the `rke2 server` process listens on port `9345` for new nodes to register. The Kubernetes API is served on port `6443`, as normal. Configure your load balancer accordingly. Additional checks may be required in your load balancer to determine if the server is ready. For example, an additional monitoring check in the load balancer for port 6443 can help determine if the node can join the cluster.

Member

See the discussion at rancher/rke2#5557 (comment)

Note that the ports to health-check depend on the role:

  • control-plane servers should use 6443 as a health check port
  • etcd servers should use 2379 as a health check port
  • servers with both control-plane and etcd roles should health-check both ports
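
To make the role-based checks above concrete, here is a minimal Python sketch of a plain TCP probe. The `HEALTH_PORTS` table and the `port_open` / `server_healthy` helpers are hypothetical names for this example only, not part of RKE2, and a real load balancer would express the same idea in its own monitor configuration.

```python
import socket

# Health-check ports per server role, taken from the list above.
# The role names used as keys are assumptions made for this sketch.
HEALTH_PORTS = {
    "control-plane": (6443,),
    "etcd": (2379,),
    "control-plane+etcd": (6443, 2379),
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def server_healthy(host, role, timeout=2.0, probe=port_open):
    """Treat a server as healthy only if every port for its role accepts connections."""
    return all(probe(host, port, timeout) for port in HEALTH_PORTS[role])
```

A TCP connect only shows that something is listening, so this is weaker than the authenticated readiness request mentioned in this comment. The `probe` parameter exists so the check can be swapped out or exercised without a live cluster.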

You can also make an authenticated request to the readyz URL listed in that comment as a health-check which will cover both of the ports listed above, as well as internal readiness of RKE2 itself.
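
For the authenticated variant, a rough sketch might look like the following. The exact readyz URL and the credentials it expects are given in the linked comment and are not reproduced here, so `url` and `headers` are left as caller-supplied parameters. `basic_auth_header` assumes HTTP Basic authentication purely for illustration, and both function names are hypothetical.

```python
import base64
import ssl
import urllib.error
import urllib.request


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header (assumed scheme, for illustration)."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def is_ready(url, headers, timeout=3.0, ca_file=None):
    """Return True only if the readiness URL answers with HTTP 200."""
    try:
        request = urllib.request.Request(url, headers=headers)
        # ca_file lets the caller trust the cluster's CA; None uses system defaults.
        context = ssl.create_default_context(cafile=ca_file)
        with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Non-2xx responses, TLS or connection errors, and malformed URLs all count as not ready.
        return False
```

Failing closed on every error keeps a node out of the pool until it positively reports ready, which is the behaviour the join race described in this PR calls for.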

@dereknola dereknola closed this Oct 14, 2024