Comment by __turbobrew__ 9 hours ago

A good reason to health check the kubelet process and restart it when the checks fail.
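A minimal sketch of the "health check, then restart" idea: run a probe command a few times and restart the service only if every attempt fails. The function name `restart_if_unhealthy`, the retry count, and the example probe are assumptions for illustration; the example uses kubelet's local healthz endpoint on port 10248, which is the default but may differ in your configuration.

```shell
#!/bin/sh
# restart_if_unhealthy TRIES PROBE_CMD RESTART_CMD   (hypothetical helper)
# Runs PROBE_CMD up to TRIES times. If any attempt succeeds, prints "healthy".
# If every attempt fails, runs RESTART_CMD and prints "restarted".
restart_if_unhealthy() {
    tries=$1
    probe=$2
    restart=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if sh -c "$probe" >/dev/null 2>&1; then
            echo "healthy"
            return 0
        fi
        i=$((i + 1))
    done
    sh -c "$restart" >/dev/null 2>&1
    echo "restarted"
}

# Example invocation (not executed here; assumes the default healthz port
# and a systemd-managed kubelet):
# restart_if_unhealthy 3 \
#   'curl -fsS --max-time 5 http://127.0.0.1:10248/healthz' \
#   'systemctl restart kubelet'
```

A probe like this only catches a kubelet that stops answering. It would not fire for a process that stays alive and responsive while something else on the node is failing.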
compumike 8 hours ago

What kind of health checks? In my case, the kubelet process was staying alive and responsive to queries, I believe due to:

    # cat /proc/$(pgrep kubelet)/oom_score_adj
    -999

(from OOMScoreAdjust=-999 in /etc/systemd/system/kubelet.service)

With this score, the Linux OOM killer wouldn't touch it, but any of my Pods were fair game.
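For context, `oom_score_adj` ranges from -1000 to 1000, and -1000 exempts a process from the OOM killer entirely, so -999 is about as protected as a process can get without being fully exempt. The sketch below reads the value for a PID and flags strongly protected processes. The helper names and the -900 cutoff are assumptions for illustration, not kernel constants.

```shell
#!/bin/sh
# oom_adj_for PID [PROC_ROOT]   (hypothetical helper)
# Prints the oom_score_adj value for PID. PROC_ROOT defaults to /proc and is
# a parameter only so the function can be exercised against a fake directory.
oom_adj_for() {
    pid=$1
    root=${2:-/proc}
    cat "$root/$pid/oom_score_adj"
}

# is_oom_shielded VALUE   (hypothetical helper)
# Succeeds when VALUE is at or below -900, an arbitrary illustrative cutoff
# for "the OOM killer will pick almost anything else first".
is_oom_shielded() {
    [ "$1" -le -900 ]
}

# Example invocation (not executed here):
# adj=$(oom_adj_for "$(pgrep kubelet)")
# is_oom_shielded "$adj" && echo "kubelet is shielded from the OOM killer ($adj)"
```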
nijave 7 hours ago

At the metrics level, you can compare the old release against the new one. I have been bitten before by resource requirements changing dramatically (regardless of whether it's a bug or a functionality change).
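As a rough sketch of that old-versus-new comparison, the function below flags a release whose resource usage grew by more than a chosen percentage. The name `regressed`, the integer MiB inputs, and the 20% threshold in the example are assumptions; where the two numbers come from (your metrics system, a load test, and so on) is left open.

```shell
#!/bin/sh
# regressed OLD NEW PCT   (hypothetical helper)
# Succeeds (exit 0) when NEW exceeds OLD by more than PCT percent.
# All arguments are integers in the same unit, e.g. memory in MiB.
regressed() {
    old=$1
    new=$2
    pct=$3
    [ $((new * 100)) -gt $((old * (100 + pct))) ]
}

# Example invocation (not executed here; the values are made up):
# if regressed 100 150 20; then
#     echo "memory usage grew by more than 20% between releases"
# fi
```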