handle node failure more better
tobru committed Sep 16, 2024
1 parent 441de0b commit d84ea0f
Showing 2 changed files with 13 additions and 1 deletion.
9 changes: 9 additions & 0 deletions deployment/apps/http-echo/statefulset.yaml
@@ -14,6 +14,15 @@ spec:
         app: http-echo
     spec:
       terminationGracePeriodSeconds: 1
+      tolerations:
+        - key: "node.kubernetes.io/unreachable"
+          operator: "Exists"
+          effect: "NoExecute"
+          tolerationSeconds: 30
+        - key: "node.kubernetes.io/not-ready"
+          operator: "Exists"
+          effect: "NoExecute"
+          tolerationSeconds: 30
       containers:
         - name: echo
           image: docker.io/hashicorp/http-echo:1.0
5 changes: 4 additions & 1 deletion podstatus/app.py
@@ -115,12 +115,15 @@ def watch_pods():
         ):
             pod = event["object"]
             pod_name = pod.metadata.name
-            pod_status = pod.status.phase
             pod_index = pod.metadata.labels.get(
                 "statefulset.kubernetes.io/pod-name", "unknown"
             )
             pod_node = pod.spec.node_name if pod.spec.node_name else "unknown"

+            pod_status = pod.status.phase
+            if pod.metadata.deletion_timestamp is not None:
+                pod_status = "Terminating"
+
             yield f'data: {{"name": "{pod_name}", "status": "{pod_status}", "index": "{pod_index}", "node": "{pod_node}"}}\n\n'
     except Exception as e:
         logging.error(f"Error watching pods: {e}")
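Note on the status change in podstatus/app.py: a pod that is being deleted keeps its phase (for example "Running") and only gains metadata.deletionTimestamp, so the phase alone does not show that the pod is going away. The sketch below isolates that check in a standalone function and builds the event payload with json.dumps rather than a hand-written f-string. The helper names display_status and sse_event are hypothetical and not part of this commit; this is an illustration under those assumptions, not the author's implementation.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def display_status(phase: str, deletion_timestamp: Optional[datetime]) -> str:
    """Return "Terminating" once a deletion timestamp is set, else the raw phase."""
    if deletion_timestamp is not None:
        return "Terminating"
    return phase


def sse_event(
    name: str,
    phase: str,
    deletion_timestamp: Optional[datetime],
    index: str = "unknown",
    node: Optional[str] = None,
) -> str:
    """Format one server-sent event carrying the pod fields used in the diff."""
    payload = {
        "name": name,
        "status": display_status(phase, deletion_timestamp),
        "index": index,
        "node": node or "unknown",  # same fallback as the diff
    }
    # json.dumps escapes quotes and backslashes that an f-string would not.
    return f"data: {json.dumps(payload)}\n\n"


if __name__ == "__main__":
    deleted_at = datetime(2024, 9, 16, tzinfo=timezone.utc)
    print(sse_event("http-echo-0", "Running", deleted_at, "http-echo-0", None), end="")
```

Keeping the check in a plain function means it can be tested without a cluster, since it only needs the phase string and the optional timestamp.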