fix(manager): resume block production right when skew < max skew #1252

Merged
mtsitrin merged 8 commits into main from srene/pendingbytes-fix on Nov 26, 2024

Conversation

@srene srene commented Nov 21, 2024

PR Standards

Opening a pull request should be able to meet the following requirements

--

PR naming convention: https://hackmd.io/@nZpxHZ0CT7O5ngTp0TP9mg/HJP_jrm7A


Close #XXX

<-- Briefly describe the content of this pull request -->

For Author:

  • Targeted PR against correct branch
  • included the correct type prefix in the PR title
  • Linked to GitHub issue with discussion and accepted design
  • Targets only one GitHub issue
  • Wrote unit and integration tests
  • All CI checks have passed
  • Added relevant godoc comments

For Reviewer:

  • confirmed the correct type prefix in the PR title
  • Reviewers assigned
  • confirmed all author checklist items have been addressed

After reviewer approval:

  • In case targets main branch, PR should be squashed and merged.
  • In case PR targets a release branch, PR should be rebased.

@srene srene requested a review from a team as a code owner November 21, 2024 09:55
@srene srene marked this pull request as draft November 21, 2024 11:31
@srene srene self-assigned this Nov 21, 2024
@srene srene marked this pull request as ready for review November 21, 2024 12:04
@omritoptix omritoptix changed the title from "fix(manager): fix to max skew time pr pending bytes" to "fix(manager): resume block production right when skew < max skew" Nov 22, 2024
block/produce.go Outdated
@@ -82,7 +82,7 @@ func (m *Manager) ProduceBlockLoop(ctx context.Context, bytesProducedC chan int)
 	return nil
 case bytesProducedC <- bytesProducedN:
 default:
-	evt := &events.DataHealthStatus{Error: fmt.Errorf("bytes produced channel is full: %w", gerrc.ErrResourceExhausted)}
+	evt := &events.DataHealthStatus{Error: fmt.Errorf("Block production paused. Time between last block produced and last block submitted higher than max skew time %s: %w", m.Conf.MaxSkewTime, gerrc.ErrResourceExhausted)}
Contributor

Probably worth also including the last block submitted time here.

Contributor Author

done

block/submit.go Outdated
Comment on lines 113 to 114
pendingBytes.Store(pending)
ticker.Reset(maxBatchSubmitTime)
Contributor

I don't understand this. Can you please explain? Thanks.

Contributor Author

Moved after the loop and documented.

@srene srene marked this pull request as draft November 25, 2024 14:22
@srene srene marked this pull request as ready for review November 25, 2024 14:33
pending = uint64(unsubmittedBlocksBytes()) //nolint:gosec // bytes size is always positive
logger.Info("Submitted a batch to both sub-layers.", "n bytes consumed from pending", nConsumed, "pending after", pending) // TODO: debug level
pending = uint64(unsubmittedBlocksBytes()) //nolint:gosec // bytes size is always positive
if batchSkewTime() < maxSkewTime {
Contributor

Can you elaborate on the use of this `if`?
Why not call trigger.Nudge() after the submission loop has finished, as before?

Contributor Author

If it is called after the loop, block production won't restart until all pending batches that were created before block production stopped have been submitted to the SL. This way, block production restarts as soon as the skew time condition is met, without having to wait.

@mtsitrin mtsitrin merged commit 7ef37e5 into main Nov 26, 2024
4 checks passed
@mtsitrin mtsitrin deleted the srene/pendingbytes-fix branch November 26, 2024 07:06

3 participants