While operating a Solace cluster provisioned via Helm, we hit a situation where the readiness probe was unable to update the "active" label of the messaging pods, yet the script readiness_check.sh reported return code 0 and the pods continued to be seen as ready. As a consequence, the service forwarded traffic to the inactive node, which then rejected connections.
Steps to reproduce the issue:
1. Provision a Solace cluster. The primary node should have the label "active" set to "true" and the backup node should have the label "active" set to "false".
2. Remove the created rolebinding so that the service account used by the pods no longer has permission to call the pod patch API.
3. Execute a failover from the primary to the backup (a kubectl sketch of steps 2 and 3 follows this list).
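For illustration, a minimal sketch of steps 2 and 3 using kubectl. The resource names below are placeholders (they depend on the Helm release), and the exact failover procedure may differ; deleting the active pod is only one way to force a failover:

# Placeholder names for illustration only; substitute the rolebinding, pod and
# namespace names created by your Helm release.
NAMESPACE=solace
ROLEBINDING=my-release-pubsubplus-sa-to-podtagupdater   # assumed name pattern
PRIMARY_POD=my-release-pubsubplus-0                      # assumed primary pod

# Step 2: drop the permission that lets the readiness probe patch pod labels
kubectl delete rolebinding "$ROLEBINDING" -n "$NAMESPACE"

# Step 3: force a failover, e.g. by deleting the currently active (primary) pod;
# the backup broker should then become active
kubectl delete pod "$PRIMARY_POD" -n "$NAMESPACE"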
Observed behavior
The primary node remains ready and keeps the label "active" set to "true" even though it is now inactive.
The backup node remains ready and keeps the label "active" set to "false" even though it is now active.
The service therefore keeps forwarding traffic to the primary node, which is inactive and rejects connections.
Expected behavior
Both pods should be marked as not ready, because the readiness probe cannot call the pod patch API. The script readiness_check.sh should return a non-zero return code.
Probable cause
The following two curl calls return exit code 0 even when the Kubernetes API responds with HTTP 403, because curl does not treat HTTP error statuses as failures by default.
solaceConfigMap.yaml
if ! curl -sS --output /dev/null --cacert $CACERT --connect-timeout 5 \
--request PATCH --data "$(cat /tmp/patch_label.json)" \
-H "Authorization: Bearer $KUBE_TOKEN" -H "Content-Type:application/json-patch+json" \
$K8S/api/v1/namespaces/$NAMESPACE/pods/$HOSTNAME ; then
# Label update didn't work this way, fall back to alternative legacy method to update label
if ! curl -sSk --output /dev/null -H "Authorization: Bearer $KUBE_TOKEN" --request PATCH --data "$(cat /tmp/patch_label.json)" \
-H "Content-Type:application/json-patch+json" \
https://$KUBERNETES_SERVICE_HOST:$KUBERNETES_PORT_443_TCP_PORT/api/v1/namespaces/$STATEFULSET_NAMESPACE/pods/$HOSTNAME ; then
echo "`date` ERROR: ${APP}-Unable to update pod label, check access from pod to K8s API or RBAC authorization" >&2
rm -f ${FINAL_ACTIVITY_LOGGED_TRACKING_FILE}; exit 1
fi
fi
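A possible direction for a fix (a sketch only, not a tested or official patch): pass --fail to both curl calls so that an HTTP error status such as 403 produces a non-zero curl exit code, which makes the existing error branch run and the probe exit with 1. For example, for the first call:

# Sketch of a possible fix (the same idea would apply to the fallback curl call):
# --fail makes curl exit non-zero (22) on HTTP responses >= 400, e.g. the 403
# returned when RBAC denies the pod patch, so the error branch is reached and
# readiness_check.sh exits with 1 instead of 0.
if ! curl -sS --fail --output /dev/null --cacert $CACERT --connect-timeout 5 \
    --request PATCH --data "$(cat /tmp/patch_label.json)" \
    -H "Authorization: Bearer $KUBE_TOKEN" -H "Content-Type:application/json-patch+json" \
    $K8S/api/v1/namespaces/$NAMESPACE/pods/$HOSTNAME ; then
  echo "`date` ERROR: ${APP}-Unable to update pod label, check access from pod to K8s API or RBAC authorization" >&2
  rm -f ${FINAL_ACTIVITY_LOGGED_TRACKING_FILE}; exit 1
fi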