To Reproduce
Scale down a deployment whose app is unable to stop within the allowed grace period.
Expected behavior
No alert when pods are killed after the grace period as part of normal cluster behavior (scaling, rescheduling, ...).
Actual behavior
kubelet log:
I0621 09:54:32.178488 1975 kuberuntime_container.go:742] "Killing container with a grace period" pod="google-sync-app-master/hutch-65d56f4988-9vw5f" podUID=1caade92-2bb7-4542-a0e0-acdee0df6c47 containerName="hutch-container" containerID="containerd://4031df8df9362d4df69ace1af956eb7430fa0b4e819f4059c54c790e28a2bd61" gracePeriod=30
kwatch log entry that triggers the notification:
{"level":"info", "msg":"sending event: {PodName:hutch-65d56f4988-9vw5f ContainerName:hutch-container Namespace:google-sync-app-master Reason:Error Events:[2024-06-21 09:54:32 +0000 UTC] Killing Stopping container hutch-container Logs:Docker Starting hutch in hutch-start.sh
ENVKEY_ENV: preprod
PING rabbitmq.rabbitmq.svc.cluster.local:15672
RabbitMQ is UP!
2024-06-19T08:30:08Z 19 INFO -- writing pid in /home/effilab/tmp/hutch.pid
2024-06-19T08:30:08Z 19 INFO -- hutch booted with pid 19
2024-06-19T08:30:08Z 19 INFO -- found rails project (.), booting app in preprod environment
Labels:map[app:google-sync-app-master pod-template-hash:65d56f4988 role:hutch]}"}
Version/Commit
Everything was fine before 0.9.
The notification is not triggered with 0.9, but that could be a side effect of other bugs fixed in subsequent releases, since 0.9 produces log lines like this:
{"level":"info","msg":"container only issue nginx tag-xy-6ff64687c7-zsmb4 tag-xy-6ff64687c7 Error 137","time":"2024-06-21T14:53:19Z"}
Describe the bug
It looks like the feature that ignores failed graceful shutdowns has not been working correctly in recent versions. It was fine before 0.9.
This looks like a consequence of the refactoring done in #280,
in particular https://github.com/abahmed/kwatch/blob/main/filter/containerKillingFilter.go
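To make the expected behavior concrete, here is a minimal sketch of the check I would expect such a filter to perform. It is not the actual kwatch code: the `ContainerState` type, the `killedAfterGracePeriod` and `shouldNotify` functions, and the `ignoreKilled` flag are all hypothetical names used for illustration. It assumes a container that exits with code 137 (128 + SIGKILL) while the pod has a `Killing` event was terminated by the kubelet after its grace period expired.

```go
package main

import "fmt"

// ContainerState is a hypothetical, simplified view of a terminated
// container. It is NOT a kwatch or Kubernetes type.
type ContainerState struct {
	ExitCode     int32
	Reason       string   // e.g. "Error"
	EventReasons []string // reasons of recent pod events, e.g. "Killing"
}

// killedAfterGracePeriod reports whether the container looks like it was
// force-killed by the kubelet: exit code 137 (128 + SIGKILL) together with
// a "Killing" event on the pod.
func killedAfterGracePeriod(s ContainerState) bool {
	if s.ExitCode != 137 {
		return false
	}
	for _, r := range s.EventReasons {
		if r == "Killing" {
			return true
		}
	}
	return false
}

// shouldNotify decides whether an alert is sent. ignoreKilled stands in for
// the "ignore failed graceful shutdown" setting (hypothetical name).
func shouldNotify(s ContainerState, ignoreKilled bool) bool {
	if ignoreKilled && killedAfterGracePeriod(s) {
		return false
	}
	return s.ExitCode != 0
}

func main() {
	// Mirrors the case in this report: scaled-down pod, container killed
	// after gracePeriod=30.
	scaledDown := ContainerState{
		ExitCode:     137,
		Reason:       "Error",
		EventReasons: []string{"Killing"},
	}
	fmt.Println(shouldNotify(scaledDown, true))  // false: no alert expected
	fmt.Println(shouldNotify(scaledDown, false)) // true: alert when the option is off
}
```

With the option enabled, a real crash (for example exit code 1 with no `Killing` event) should still produce an alert; only the forced kill after the grace period is suppressed.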