Use Fallback RPC when the primary Sync data-source stalled #465
base: dz/fallback-rpc
let fetchNext: (
  t,
  ~fetchState: FetchState.t,
  ~currentBlockHeight: int,
  ~executeQuery: (FetchState.query, ~source: Source.t) => promise<unit>,
  ~waitForNewBlock: (~currentBlockHeight: int) => promise<int>,
  ~onNewBlock: (~currentBlockHeight: int) => unit,
  ~maxPerChainQueueSize: int,
  ~stateId: int,
) => promise<unit>

let waitForNewBlock: (t, ~currentBlockHeight: int) => promise<int>
Good to have them as two separate testable functions
  ~stalledPollingInterval: int=?,
) => t

let getActiveSource: t => Source.t
@JonoPrest As you suggested, I stopped exposing the mutable field 👍
    logger->Logging.childError(
      `No new blocks detected within ${(sourceManager.newBlockFallbackStallTimeout / 1000)
        ->Int.toString}s. Polling will continue at a reduced rate. For better reliability, refer to our RPC fallback guide: https://docs.envio.dev/docs/HyperIndex/rpc-sync`,
    )
  | _ =>
    logger->Logging.childWarn(
      `No new blocks detected within ${(sourceManager.newBlockFallbackStallTimeout / 1000)
        ->Int.toString}s. Continuing polling with fallback RPC sources from the configuration.`,
    )
}
Only show warning/error after 20s. Before this, all the other logs are traces.
  | [] =>
    logger->Logging.childError(
      `No new blocks detected within ${(sourceManager.newBlockFallbackStallTimeout / 1000)
        ->Int.toString}s. Polling will continue at a reduced rate. For better reliability, refer to our RPC fallback guide: https://docs.envio.dev/docs/HyperIndex/rpc-sync`,
I'll need to update this one when there's a docs page 😁
// Show a higher level log if we displayed a warning/error after newBlockFallbackStallTimeout
let log = status.contents === Stalled ? Logging.childInfo : Logging.childTrace
logger->log({
  "msg": `New blocks successfully found.`,
  "source": source.name,
  "newBlockHeight": newBlockHeight,
})
This is what we discussed for other places to show a success log after a warning.
source.sourceFor === Sync ||
// Even if the active source is a fallback, still include
// it in the list, so we don't wait for a timeout again
// if all main sync sources are still not valid
source === sourceManager.activeSource
This might cause problems when returning from the fallback to HyperSync, but it is probably not a big problem when we are at the head and the fallback has lower latency.
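To make the filter above concrete, here is a minimal, self-contained sketch. The `sourceFor` variant with a `Fallback` case, the simplified `source` record, and the `candidateSources` helper are assumptions made for illustration; they are not taken from the PR.

```rescript
// Assumed, simplified types for illustration only.
type sourceFor = Sync | Fallback
type source = {name: string, sourceFor: sourceFor}

// Keep every Sync source, plus the currently active source even when it
// is a fallback, so a working fallback is not dropped from the rotation.
let candidateSources = (sources: array<source>, ~activeSource: source) =>
  sources->Belt.Array.keep(source =>
    source.sourceFor === Sync || source === activeSource
  )

let hyperSync = {name: "HyperSync", sourceFor: Sync}
let rpc = {name: "RPC", sourceFor: Fallback}
let spareRpc = {name: "Spare RPC", sourceFor: Fallback}

// The active fallback (rpc) stays in the list; the inactive one does not.
let candidates = candidateSources([hyperSync, rpc, spareRpc], ~activeSource=rpc)
Js.log(candidates->Belt.Array.map(source => source.name))
```

Because the comparison with the active source is by reference (`===`), the sketch relies on the same record instance being both in the list and passed as `~activeSource`.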
~delayMilliseconds=Pervasives.min(
  retryIntervalMillis.contents * backOffMultiplicative,
  60_000, // Cap the backoff delay at one minute
),
Added a safeguard to not exceed a minute :)
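A small sketch of the one-minute safeguard, assuming a multiplier of 2 and an initial delay of one second (both values and the `nextRetryDelay` helper are hypothetical, not from the PR). Note that `min` is what enforces an upper bound; `max` with the same arguments would instead make one minute the lower bound.

```rescript
// Upper bound for the retry delay, matching the 60_000 ms in the diff.
let maxDelayMillis = 60_000

// Hypothetical helper: multiply the current delay, but never exceed the cap.
let nextRetryDelay = (~current: int, ~multiplier: int) =>
  Pervasives.min(current * multiplier, maxDelayMillis)

// Walk through a few retries, starting from an assumed 1_000 ms delay.
let delay = ref(1_000)
for attempt in 1 to 8 {
  Js.log2(attempt, delay.contents)
  delay := nextRetryDelay(~current=delay.contents, ~multiplier=2)
}
```

With these assumed values the delay doubles up to 32_000 ms and then stays at 60_000 ms for every later attempt.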
let delayMilliseconds = if status.contents === Stalled {
  retryIntervalMillis := stalledPollingInterval // Reset possible backOff
  stalledPollingInterval
} else {
  retryIntervalMillis := initalRetryIntervalMillis // Reset possible backOff
  source.pollingInterval
}
Not the most beautiful-looking code, but I tested it well 👍
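One possible way to express the same decision without mutating inside the branches is a pure function that returns both values. This is only a sketch: the `status` variant (beyond the `Stalled` case visible in the diff), the `nextPoll` helper, and the numbers in the usage are hypothetical.

```rescript
// Hypothetical status type; the diff only shows that a `Stalled` case exists.
type status = Active | Stalled

// Returns (delay before the next poll, value to reset the retry interval to).
let nextPoll = (
  ~status: status,
  ~stalledPollingInterval: int,
  ~sourcePollingInterval: int,
  ~initialRetryInterval: int,
) =>
  switch status {
  | Stalled => (stalledPollingInterval, stalledPollingInterval)
  | Active => (sourcePollingInterval, initialRetryInterval)
  }

let (delay, retryInterval) = nextPoll(
  ~status=Stalled,
  ~stalledPollingInterval=5_000,
  ~sourcePollingInterval=1_000,
  ~initialRetryInterval=500,
)
Js.log2(delay, retryInterval)
```

The caller would then assign the second element to `retryIntervalMillis` in one place, which keeps the backoff reset visible at the call site.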
logger->Logging.childTrace({
  "msg": `Height retrieval from ${source.name} source failed. Retrying in ${retryIntervalMillis.contents->Int.toString}ms.`,
  "source": source.name,
  "error": exn->ErrorHandling.prettifyExn,
This is not a warning anymore
What's left: