Evaluate TCPunch on Google Cloud #2

Open
mcopik opened this issue Feb 16, 2023 · 7 comments
Labels
enhancement New feature or request

Comments

@mcopik
Contributor

mcopik commented Feb 16, 2023

We have verified that TCPunch works on AWS. However, it has yet to be established whether the implemented NAT hole punching will work on Google Cloud. This step is necessary to run TCP communication between two different functions.

We should first run this on two VMs to verify that the connection is established, and then try to establish a TCP connection between a VM and a function.

@PranayB003

@mcopik I have completed the first part of this issue (establishing a connection between 2 VMs), and my work can be viewed in this repo. I'm working on the second part (establishing a connection between a VM and a function). Please do let me know if you require any changes or have suggestions!

@PranayB003

@mcopik I've tested communication between a serverless function and a VM. It seems that the serverless function services on GCP (Cloud Functions and Cloud Run) only allow incoming traffic over HTTP, and only on a single port. Because of this, the hole punching server's response never reaches the function, and the function times out. The outgoing request from the function does reach the hole punching server, though. I think we can get around this issue by letting the client specify which port and protocol it expects a response on when calling pair(). What do you think?

@mcopik
Contributor Author

mcopik commented Apr 10, 2024

@PranayB003 Thanks for the update! How did you arrive at the conclusion above? Is it the case that the function opens a connection to the hole punching server, sends a request, but never receives a reply from the server?

That would be a strange setting as it would effectively prevent making any HTTP requests from the function, e.g., to the database.

@PranayB003

@mcopik You're right. I came to that conclusion because the server received the client's request and sent a response, but the client never got it. I confirmed this through the logs; please see these images of the Cloud Run logs and the hole punching server's logs.
[Image: Cloud Run log]
[Image: hole punching server log (VM)]
Your comment about the function not being able to make HTTP requests has got me thinking too. Logically, there should be a way to get a response back, and I'm currently looking into this. Did you face any related issues when you first tried TCPunch on AWS? Any other advice is greatly appreciated!

@PranayB003

@mcopik The Cloud Run instance does receive the reply from the hole punching server. I checked by enabling the debugging statements in TCPunch. Please find below the logs that show this:
[Image: screenshot of debug logs, 2024-04-11 1:48:34 AM]
[Image: screenshot of debug logs, 2024-04-11 1:48:06 AM]

It seems the problem is that Cloud Run instances can make outgoing TCP connections (and subsequently send and receive messages on them) but cannot accept new incoming TCP connections (on arbitrary ports apart from the one that's open to HTTP requests). That is why the call to pair() keeps waiting to accept a connection from the peer VM and eventually times out.
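
One way to check this asymmetry in isolation, independent of TCPunch, is a small probe run inside the instance. The sketch below is only an illustration: the helper names `can_connect_out` and `wait_for_inbound` are made up here, and it assumes Python is available in the container and that a peer on the VM side tries to connect while the probe is listening.

```python
import socket


def can_connect_out(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if an outgoing TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False


def wait_for_inbound(port: int, timeout: float = 10.0) -> bool:
    """Listen on `port` and report whether any peer connects before the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("0.0.0.0", port))
        srv.listen(1)
        srv.settimeout(timeout)
        try:
            conn, _peer = srv.accept()
        except socket.timeout:
            return False
        conn.close()
        return True
```

If `can_connect_out` returns True for the hole punching server while `wait_for_inbound` returns False for every port the VM dials, that would be consistent with the behaviour described above. It would not show where in the platform the incoming connection is dropped.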

@mcopik
Contributor Author

mcopik commented Apr 16, 2024

@PranayB003 In general, functions cannot accept incoming connections - that's why we need the hole punching :)

On AWS Lambda, we sometimes had issues with the robustness of the TCP connection but never had problems with creating the connection. The only important factor was that if you try the VM-TCP connection, the VM needs to have its security policies updated such that it allows all incoming connections on ports since our hole punching implementation was not restricted to any specific port selection.

Regarding "but cannot accept new incoming TCP connections (on arbitrary ports apart from the one that's open to HTTP requests)": does it mean that you verified it works if you restrict port selection to the one already open? That might also not work if there's an HTTP server actively polling for new invocations (it might read the incoming TCP data), but it will work if the server does not poll while the function is executing.
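
For reference, restricting port selection could look roughly like the sketch below: the listening socket and the outgoing socket are both pinned to one chosen local port. This is a generic illustration of the technique, not TCPunch's code. The function name `listen_and_dial_from_same_port` is hypothetical, and the sketch assumes a platform where `SO_REUSEADDR` (plus `SO_REUSEPORT` where it exists) lets two sockets share a local port. Whether Cloud Run forwards raw TCP arriving on its open port to the container is exactly the open question, so this only covers the socket side.

```python
import socket


def reuse_socket() -> socket.socket:
    """Create a TCP socket that may share its local port with another socket."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if hasattr(socket, "SO_REUSEPORT"):  # not defined on every platform
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    return s


def listen_and_dial_from_same_port(local_port, remote, timeout=3.0):
    """Listen on `local_port` and connect to `remote` from that same port.

    Returns (listener, dialer). Passing 0 lets the OS pick the port;
    the dialer is then bound to whichever port the listener received.
    """
    listener = reuse_socket()
    dialer = reuse_socket()
    try:
        listener.bind(("0.0.0.0", local_port))
        listener.listen(1)
        port = listener.getsockname()[1]
        dialer.bind(("0.0.0.0", port))
        dialer.settimeout(timeout)
        dialer.connect(remote)
    except OSError:
        listener.close()
        dialer.close()
        raise
    return listener, dialer
```

The caller would then wait on `listener.accept()` for the peer while `dialer` stays connected to the hole punching server. If an HTTP server in the same container already owns the open port, the bind is likely to fail or compete with that server for incoming connections, which is the polling concern above.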

@PranayB003

@mcopik

> The only important factor was that if you try the VM-TCP connection, the VM needs to have its security policies updated such that it allows all incoming connections on ports

Yep, I've done this in my evaluation too. However, GCP seems to differ from AWS: the Cloud Run instance is unable to accept incoming connections even after the hole is punched (I consider the hole punched because the request/response exchange with the hole-punching server succeeded).

> does it mean that you verified it works if you restrict port selection to the one already open?

Not yet. I was reading the docs and other sources to find out whether subsequent requests to the same Cloud Run instance IP (on the open HTTP port) would be:

  • intercepted by a polling HTTP server
  • handed to a new program invocation within the same Cloud Run instance
  • handled by the function code (no HTTP server polling for requests while the function is executing)

I did not find any concrete mention of these mechanisms in the docs or elsewhere, so I'll have to just try it out practically. I have end-semester exams presently, so I haven't been able to devote time to this for the past week. I'll get back to working on this after 2nd May.
