Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't pause between long polls #436

Merged
merged 1 commit into from
Dec 11, 2024
Merged

Conversation

DanGould
Copy link
Contributor

Pausing different amounts between clients creates a fingerprint. Set the pause to spec at 0 seconds.

@DanGould DanGould added bug Something isn't working payjoin-cli labels Dec 10, 2024
@DanGould DanGould requested a review from nothingmuch December 10, 2024 17:23
Pausing different amounts between clients creates a fingerprint.
Set the pause to spec at 0 seconds.
@coveralls
Copy link
Collaborator

coveralls commented Dec 10, 2024

Pull Request Test Coverage Report for Build 12261407940

Details

  • 0 of 2 (0.0%) changed or added relevant lines in 1 file are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+5.1%) to 67.089%

Changes Missing Coverage Covered Lines Changed/Added Lines %
payjoin-cli/src/app/v2.rs 0 2 0.0%
Files with Coverage Reduction New Missed Lines %
payjoin-cli/src/app/v2.rs 1 21.4%
payjoin/src/send/mod.rs 2 92.78%
Totals Coverage Status
Change from base Build 12239588793: 5.1%
Covered Lines: 3121
Relevant Lines: 4652

💛 - Coveralls

Copy link
Collaborator

@nothingmuch nothingmuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the code is very straight forward, so ACK on that.

The rationale I'm not 100% sure about.

First off, I would personally rationalize this more in terms of UX/responsiveness, long polling is kind of a hack to get around the client initiated communication pattern of HTTP and especially OHTTP, in an ideal world the listener would not need to poll at all, but given the constraints of the architecture it's necessary, and it follows that having the minimal amount of "dead" time in between polling attempts is advantageous for responsiveness.

Turning to rationalizing this from a privacy point of view, here I'm a little more dubious, because adding some amount of noise to the delays can be meaningful.

As I understand it the goal here is to discourage different implementations from picking different values.

Let's imagine a situation, there are 3 parties, Alice, Bob, and Carol, all using the same relay, but they are situated in different continents, so the end to end latency they each have against the relay is measurably different, and all of them transact with each other semi-regularly.

For a given subdirectory ID, the amount of time in between requests that the directory sees before this change was 5 seconds + R_i, where R is the round trip latency to the relay and i indexes { Alice, Bob or Carol }.

The directory will is able to measure R_i indirectly before this change, and almost directly after it, thereby potentially identifying who is the receiver and who is the sender, since those roles are distinguishable based on the latency patterns of a pair of potentially related (i.e. temporally close but satisfying the ordering constraints of the BIP 77 messaging patterns). In other words R_i is potentially identifying metadata that leaks for each subdirectory, and by correlating different subdirectories (for which differences in this noisy measurement are helpful), the directory could still heuristically identify the parties involved, or at least learn some information or constraints about them with the same difficulty both before and after this change.

If on the other hand the spec encouraged clients to first sample some additional noise to artificially add to the latency, I believe that would address the conern desscribed in the PR more comprehensively than simply divergence in fixed values chosen by different implementations.

Note that this noise could in principle be negative too, by sending the next request slightly before an in flight one is expected to time out and return, at the cost of a low probability of small waste in bandwidth.

Anyway, cACK, and ACK, but perhaps this should be followed up by something a little more defensive. That said, I'm not convinced such metadata leaks are justifiably in the threat model. Thoughts?

@DanGould
Copy link
Contributor Author

DanGould commented Dec 11, 2024

I'm going to merge this as-is with a merge commit that corrects the change commit's rationale to address 1. consistency 2. simplicity of implementation and 3. UX/responsiveness.

Then, I'm going to move your comment into a new issue so that we can decide yea or nay to include timing attacks in the threat model and document the decision. Adding noise does indeed make for better privacy against metadata timing attacks in some circumstances but I reckon we need to calculate at what levels of traffic adding noise even becomes useful against specific adversaries.

@DanGould DanGould merged commit 0891a5b into payjoin:master Dec 11, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working payjoin-cli
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants