Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for journal hibernation #411

Merged
merged 9 commits into from
Jan 14, 2025
Merged

support for journal hibernation #411

merged 9 commits into from
Jan 14, 2025

Conversation

jgraettinger
Copy link
Contributor

@jgraettinger jgraettinger commented Dec 15, 2024

  • Implement journal suspension and auto-suspension, under which journals use fewer or no runtime resources if they're not being actively appended to or if they have no content.

  • Overhaul the CI build system and introduce ARM64 builds of the Gazette broker.

  • Various other small updates; see commits.

This change is Reviewable

@jgraettinger jgraettinger force-pushed the johnny/hibernate branch 11 times, most recently from 59e72fc to 01ddcb3 Compare December 24, 2024 22:49
@jgraettinger jgraettinger marked this pull request as ready for review January 9, 2025 20:27
@jgraettinger jgraettinger requested a review from psFried January 9, 2025 20:27
Copy link
Contributor

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really nice. I left a few inline comments, though IDK whether any of them should block merging. I think the docs in overview-build-and-test.rst could use updated before merging, though.

@@ -15,7 +15,8 @@ import (

// DiagnosticsConfig configures pull-based application metrics, debugging and diagnostics.
type DiagnosticsConfig struct {
// Nothing to see here (yet).
Port string `long:"port" env:"PORT" description:"Port for diagnostics"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if you pass --private but no --port argument? Would that effectively disable the diagnostics endpoints? Should we validate that port is present if private is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would effectively disable diagnostics. I'll add a logged warning.

//
// The Suspend field is managed by Gazette, and is updated when an Append RPC
// suspends a journal or if auto-suspension is enabled. Operators should
// not set it directly. However when utilizing suspension, operators MUST
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you must take care not to accidentally clobber the current value of suspend when running, for example, gazctl journals apply. What would happen if you changed the value of suspend anyway? Would the broker actually change the suspension state of the journal, or does it just reject the request?

I'm not seeing where it'd be rejecting the request, so I'm guessing it'd actually update the value. This seems a little surprising to me. I don't really see a use case for allowing clients to directly change these fields, so it seems like maybe we ought to reject Apply requests that attempt to change it. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if you changed the value of suspend anyway?

  • If the journal is partially suspended (one read-only replica), then the overwrite un-suspends it and there's no harm done. The prior replica already had the correct resume offset which is synchronizes with new topology peers.

  • If the journal is partially suspended AND the single scaled-down replica crashes at the same time, you may need to run reset head in order to resume appends. The precondition was that all data is already in cloud storage, but it cannot now use the suspend offset to automatically resume the next append offset.

  • If the journal is fully suspended (no data exists in cloud storage) and its spec is clobbered, then it will become active and begin writing from offset zero.

I'm not seeing where it'd be rejecting the request, so I'm guessing it'd actually update the value. This seems a little surprising to me.

That's correct, it will overwrite it. From a technical angle, the control flow for applying a spec update verifies the applied spec is valid, but doesn't even consider the prior spec state (only the client-supplied ModRevision).

We could do more validation here, but i didn't feel it was worth it and i can also imagine scenarios where it's desirable to manually change these fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the prior value was unsuspended and you set it to either partially or fully suspended, would the brokers handle that by either partially or fully suspending the journal?

I'm also wondering how this would work in the scenario where an older client version is running against a newer broker. The old client would presumably leave the suspend field blank, which seems like it'd effectively clobber whatever is there. We might just say not to use journal suspension unless all your clients are updated, though that does make me a little nervous because it'd be easy for someone to accidentally use an outdated gazctl.

I'm not really sure how important those concerns are, but I'm mainly just trying to understand what we'll need to do as part of the upgrade and what the risks are with the Apply RPC.

broker/append_fsm.go Show resolved Hide resolved
broker/append_fsm.go Show resolved Hide resolved
broker/append_fsm.go Show resolved Hide resolved
When `Port` is set, serve diagnostics (on http.DefaultServeMux) over the indicated port.

When `Private` is set, don't serve diagnostics over the main service port.
Truncating slices but holding their memory means that the garbage
collector is unable to free the backing files of persisted spools which
happen to be resident in the queue.
@jgraettinger
Copy link
Contributor Author

Made some lightweight updates to overview-build-and-test.rst, mostly removing stuff.

I backed out upgrades of gRPC, as changes there have altered memory semantics compared to its prior behavior (it now assumes it owns buffers of messages sent into SendMsg, where it hadn't before and we relied on that behavior).

Copy link
Contributor

@psFried psFried left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Introduce a journal suspension mechanism which is incorporated into the
Append RPC state machine, and stored as a `suspend` field within each
JournalSpec.

Suspended journals consumer fewer or no assignment slots, reducing
broker and Etcd resources required for journals which are not actively
serving appends.

When running with the `--auto-suspend` flag, brokers will automatically
suspend journals which are fully flushed to the fragment index.
Previously, a `stopCh` signal could hang the "pump" goroutine, sending
into a channel which is never read.

Instead, ensure we fully consume the pump channel and use a context
cancellation to ensure it exits on `stopCh`.
Junk the docker-based builder strategy and instead use the GitHub
Actions CI environment for builds.

Introduce an ARM64 broker variant using cross-compilation and the Github
docker build action with QEMU.

Switch to using the librocksdb8.9 library provided by Ubuntu 24.04
within CI and within the "examples" image, rather than pinning RocksDB
and building it ourselves.

Switch to Github container registry for release builds.
Remove --disable-stores behavior, which causes fragments to never persist
to their configured remote store. This flag has historically been used
in testing contexts.

Replace it with --file-only, which achieves a similar testing outcome by
re-interpreting a s3://bucket/foobar store to instead use file:///foobar

This preserves the desired behavior of keeping changes local and not
writing to remote fragment stores, while allowing for test brokers
to be restarted without data loss, as well as allowing for testing
of features that require persistence (like journal suspension).
@jgraettinger jgraettinger force-pushed the johnny/hibernate branch 2 times, most recently from afe1503 to 95692a2 Compare January 14, 2025 04:42
@jgraettinger jgraettinger merged commit ea79744 into master Jan 14, 2025
2 checks passed
@jgraettinger jgraettinger deleted the johnny/hibernate branch January 14, 2025 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants