Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work around uuid file corruption in balenaEngine health check #3409

Merged
merged 1 commit into from
Jun 6, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -7,24 +7,46 @@ set -o errexit
BALENAD_SOCKET="/run/balena.sock"
CONTAINERD_SOCKET="/run/balena-engine/containerd/balena-engine-containerd.sock"

# Checks if the file $1 contains an UUID in any of the formats described here:
# https://github.com/google/uuid/blob/6e10cd1027e225e3ad7bfcc13c896abd165b02ef/uuid.go#L189
contains_valid_uuid()
{
# This matches xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, which is the one format
# we have seen in practice, and is also the base for most other supported
# formats.
BASE_UUID_REGEX="[[:xdigit:]]{8}-([[:xdigit:]]{4}-){3}[[:xdigit:]]{12}"

# Check for all the possible formats, starting with the most common one, to
# benefit from the short-circuited evaluation. The contents of the uuid file
# must match these formats exactly, so we anchor the regexes both at the
# beginning (^) and the end ($).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the detailed comments!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe time travel is impossible, because otherwise I'd expect someone to have come back to thank me for those 😝

grep -E "^${BASE_UUID_REGEX}$" "$1" > /dev/null || \
grep -E "^\{${BASE_UUID_REGEX}\}$" "$1" > /dev/null || \
grep -E "^urn:uuid:${BASE_UUID_REGEX}$" "$1" > /dev/null || \
grep -E "^[[:xdigit:]]{32}$" "$1" > /dev/null
}


# Check if balena-engine-daemon is responding.
curl --fail --unix-socket $BALENAD_SOCKET http:/v1.40/_ping > /dev/null 2>&1

# Due to a non-atomic file creation and writing operation in containerd, we
# sometimes end up with an empty `uuid` file. This causes `ctr version` (and
# hence the health check) to fail. We therefore remove this file if it is
# present and empty. See https://github.com/balena-os/balena-engine/issues/322
# sometimes end up with an empty or corrupted `uuid` file. This causes
# `ctr version` (and hence the health check) to fail. We therefore remove this
# file if it is present not valid. See
# https://github.com/balena-os/balena-engine/issues/322

UUID_FILE="/mnt/data/docker/containerd/daemon/io.containerd.grpc.v1.introspection/uuid"
if [ -f "$UUID_FILE" -a ! -s "$UUID_FILE" ]; then
warn "removing empty $UUID_FILE"
if [ -f "$UUID_FILE" ] && ! contains_valid_uuid "$UUID_FILE"; then
warn "removing invalid $UUID_FILE"
rm -f "$UUID_FILE"
fi

# Check if balena-engine-containerd is responding.
balena-engine-containerd-ctr --address $CONTAINERD_SOCKET version > /dev/null 2>&1

# The uuid file is expected to exist and be non-empty after `ctr version`. If
# The uuid file is expected to exist and be valid after `ctr version`. If
# this is not the case, log the event.
if [ -f "$UUID_FILE" -a ! -s "$UUID_FILE" ]; then
warn "$UUID_FILE empty after 'ctr version'"
if [ -f "$UUID_FILE" ] && ! contains_valid_uuid "$UUID_FILE"; then
warn "$UUID_FILE invalid after 'ctr version'"
fi
Loading