Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: delay in TimeOffset applied to AdjustedTime introduced by send/r… #5946

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

techy2
Copy link

@techy2 techy2 commented Mar 21, 2024

On busy VPS and shared host with limited resources, the time between when a messages is sent to the
tcpip send or receive queue and when it is sent in the case of send queue, or when it is processed
(ProcessMessage) can be in excess of 30 seconds.This delay introduces a skew in AdjustedTime.

For the receive queue, the post processing uses the receive time prior to entering the queue to
calculate TimeOffset rather than Now() which currently includes the delay in the queue.

For the send queue, the queued message is altered to update the nTime of the message to the actual
time it is being sent rather than the time at which it was queued

Was tested on an hp 370 G6 24 core 3ghz 192gb host with the daemon launched with -par=2 to restrict
the resources of the daemon. Logging was added pre-patch to document the delay through the queue and
was observed for both send and receive to be occasionally > 30 seconds when the daemon was busy
following the tip or during reorgs when the cpu utilization for the assigned core approached 100%.
Significant queue delay occurs most often in the receive queue (several times a minute) and
infrequently in the send queue ( 1 observation in several hours of testing ).

Checklist:

Go over all the following points, and put an x in all the boxes that apply.

  • [x ] I have performed a self-review of my own code
  • [x ] I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

Copy link
Member

@PastaPastaPasta PastaPastaPasta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall; I would say this is way to "hacky" as your comments even describe it to be merged. It also introduces technically undefined behavior. I would probably also want to see a test where this issue presents itself

Comment on lines +927 to +930
if (! *(char *)&bigint) { // if bigendian
nSendtime = bswap_64(uSendbuff.ui64_tm);
nNewSendtime = bswap_64(nNow);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The endianess check is somewhat unconventional. Using a standard library function or a more explicit check might improve clarity.

int64_t nNow = (pnode->IsInboundConn()) ? GetAdjustedTime() : GetTime();
int64_t nNewSendtime = nNow;
if (! *(char *)&bigint) { // if bigendian
nSendtime = bswap_64(uSendbuff.ui64_tm);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is (at least technically) undefined behavior here: the last write to the union was into the uchar[8]; then it's being read via uSendbuff.ui64_tm. According to c++17 standard; you may only read from a union the type which was most recently written to; anything else is undefined behavior.

Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see below + #5460 (comment) + #5460 (comment)

tl;dr:

  1. The fix on the sending side only helps non-malicious peers to behave a bit better under high load, malicious peers still can exploit this. Moreover, adjusted time is meant to be inaccurate by design and we can't fully trust it anyway, so imo we can simply ignore this part.
  2. Using both mockable time and non-mockable time in one equation feels wrong, the fix on the receiving side can cause issues in tests potentially. To lower the impact of malicious peers we could backport p2p: Don't use timestamps from inbound peers for Adjusted Time bitcoin/bitcoin#23631 instead.
  3. We don't merge into master.

re-org operations. This delay corrupts AdjustedTime/
*/
// int64_t nTimeOffset = nTime - GetTime();
int64_t nTimeOffset = nTime - (nTimeReceived/1000000);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mixing mockable/non-mockable time here (check CNode::ReceiveMsgBytes()), might cause test issues

Copy link

This pull request has conflicts, please rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants