Reuse long running libp2p host for checks #53

Open · 2color opened this issue Aug 29, 2024 · 2 comments

Labels: dif/medium (Prior experience is likely helpful) · effort/days (Estimated to take multiple days, but less than a week) · P2 (Medium: Good to have, but can wait until someone steps up)


2color commented Aug 29, 2024

What's the problem

Running the ipfs-check backend on a machine behind NAT will fail checks against other peers that are also behind NAT. We create a short-lived test host for each check; because it doesn't connect to any peers, it never learns any observed addresses, which prevents NAT hole punching from happening.
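
For illustration, here's a minimal sketch of why the per-check host can't hole punch, using go-libp2p's public API (the actual ipfs-check wiring may differ):

```go
package main

import (
	"fmt"

	"github.com/libp2p/go-libp2p"
)

func main() {
	// A fresh host per check: hole punching is enabled, but the host has no
	// connections yet, so identify has never reported an observed address
	// and the hole-punching service has nothing to advertise.
	h, err := libp2p.New(libp2p.EnableHolePunching())
	if err != nil {
		panic(err)
	}
	defer h.Close()

	// Only local listen addresses are known at this point; there is no
	// externally observed address to use for a hole punch.
	fmt.Println(h.Addrs())
}
```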

Idea

Reuse the long-running libp2p peer for checks that don't involve an explicit multiaddr.

Since the DHT traversal that happens as part of the check will likely open a connection (and hole punch) to the peer, we could reuse the long-running libp2p host and pass it into Vole for the bitswap check. This would simplify the code and speed up the response.

Moreover, this would ensure higher success rates for users running this backend behind NAT.
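
Roughly, the change could look like the sketch below. The `daemon` struct, `longRunningHost` field, and `runBitswapCheck` function are hypothetical stand-ins for this sketch; the real Vole entry point may have a different signature:

```go
package main

import (
	"context"

	"github.com/ipfs/go-cid"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/peer"
)

// daemon stands in for the ipfs-check backend's state; longRunningHost is the
// host created once at startup (both names are made up for this sketch).
type daemon struct {
	longRunningHost host.Host
}

// runBitswapCheck reuses the long-running host instead of constructing a
// short-lived one, so observed addresses, relay reservations, and existing
// connections carry over into the check.
func (d *daemon) runBitswapCheck(ctx context.Context, c cid.Cid, ai peer.AddrInfo) error {
	h := d.longRunningHost
	if err := h.Connect(ctx, ai); err != nil {
		return err // hole punching can happen here via the shared host
	}
	// ...pass h and c into Vole's bitswap check instead of a fresh host...
	return nil
}
```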

Other ideas

Reduce activation threshold

Unfortunately, this configuration is global, so it may not work for us, but it solved a similar problem in Vole: https://github.com/ipfs-shipyard/vole/pull/39/files
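
For reference, the Vole workaround lowers go-libp2p's package-global identify activation threshold, roughly like this (being global, it would affect every host in the ipfs-check process):

```go
package main

import "github.com/libp2p/go-libp2p/p2p/protocol/identify"

func main() {
	// go-libp2p's default is 4: an address must be reported by several
	// observers before identify treats it as activated. Lowering the
	// threshold lets a lightly connected host confirm an observed address
	// sooner, but the variable is package-global, so it applies to every
	// host constructed in this process.
	identify.ActivationThresh = 1

	// ...construct hosts after this point...
}
```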

Related issues

libp2p/go-libp2p#2941


2color commented Aug 30, 2024

Some more context from @aschmahmann on situations where reusing the long-running libp2p host may reduce insight:

  • You can end up with even less debug info. For example, what if the peer has somehow messed up their peer routing (e.g. advertising the wrong addresses via identify) but we previously connected to them via an explicit address. Now we'd get a success rather than a failure.

Seems like we could still end up in the same situation; it's just that the use cases become more esoteric. A couple that come to mind:

  • You've set up a peer to publish to both the Amino DHT and IPNI (later down the road, but not that much later), but the addresses were set differently for each.
  • You previously did an IPNI lookup and learned an address for the peer that worked; we later did a DHT lookup where none of the addresses would have worked, but we still have the IPNI one in cache.
  • Your addresses were correct when you advertised CID A 10 minutes ago, and those addresses are used to find you (cached by the DHT server alongside the provider record for A). However, 10 minutes later you look for CID B, advertised yesterday, where only the provider record is present. A fresh lookup for B would fail, since the current addresses you're providing via identify to the DHT server peers are bad, but we'd see a success because the record for A is caching the old ones.

These are all fairly edge cases, so if handling them is a huge pain, maybe it's worth ignoring them as long as we document the issues (so at least we know what the limitations are). If we can do our best to replicate what should happen with a sanely configured new peer, though, then there are fewer edge cases anyone needs to keep in mind when helping a user debug. So the benefit is largely about taking this class of edge cases off the table as something to consider.


There are a few areas where (go-)libp2p users not using Amino run into problems, all in the same space:

  • Where do I discover relays for hole punching?
  • How do I discover my addresses?
  • How do I set up a new DHT (given that the nodes need to be public, but they can't figure out their addresses without existing nodes to bootstrap from)?

2color added a commit that referenced this issue Aug 30, 2024
2color mentioned this issue Aug 30, 2024

2color commented Sep 3, 2024

At the time of writing, we host this backend as a public good and don't run into this problem, because we have a public IP that allows dial-backs and removes the need for hole punching.

So this is mostly a problem if you run it behind NAT, i.e. in local development.

Some more ideas:

  • make the test host a DHT client so that it meets identify's activation threshold by having more peers confirm its observed address
  • check for observed addresses, or use AutoNAT to detect whether we're behind NAT and warn the user at startup, so that at least they know the reason for failures (see the sketch below)
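
A sketch of the second idea, subscribing to go-libp2p's reachability events (which AutoNAT drives) and warning when the host turns out to be behind NAT. The warning text is made up; the event and types come from go-libp2p's core packages:

```go
package main

import (
	"log"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/event"
	"github.com/libp2p/go-libp2p/core/network"
)

func main() {
	h, err := libp2p.New() // the AutoNAT client is enabled by default
	if err != nil {
		log.Fatal(err)
	}
	defer h.Close()

	// AutoNAT publishes reachability changes on the host's event bus.
	sub, err := h.EventBus().Subscribe(new(event.EvtLocalReachabilityChanged))
	if err != nil {
		log.Fatal(err)
	}
	defer sub.Close()

	for e := range sub.Out() {
		evt := e.(event.EvtLocalReachabilityChanged)
		if evt.Reachability == network.ReachabilityPrivate {
			log.Println("warning: this backend appears to be behind NAT; checks against NATed peers may fail")
		}
	}
}
```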

lidel added the dif/medium, effort/days, and P2 labels Sep 10, 2024