
[vsock] tx/rx hungup after large packet is delivered #494

Closed
ikicha opened this issue Oct 23, 2023 · 23 comments · Fixed by #506
Comments

@ikicha (Contributor) commented Oct 23, 2023

guest side: while true; do ncat -e /bin/cat --vsock -l 1234 -k; done;
host side:

import socket

# Connect to the host-side unix socket exposed by vhost-device-vsock,
# then ask the device to forward this connection to guest port 1234.
socket_path = '/tmp/vm3.vsock'
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(socket_path)
message = 'connect 1234\n'
client.sendall(message.encode())
response = client.recv(1024)  # the device acks the forward request
print(response)

# Echo loop: send 1 MiB of spaces, then read back what the guest echoes.
i = 0
while True:
  message = ' '*1024*1024
  client.sendall(message.encode())
  response = client.recv(1024*1024)
  print(response, i)
  i += 1

client.close()

When I ran a stress test against vhost-device-vsock, I found a bug with large rx packets. The connection got stuck after a VSOCK_OP_CREDIT_REQUEST in VsockConnection.recv_pkt() (and it looks like there is no VSOCK_OP_CREDIT_UPDATE on the tx side). It happens with both QEMU and crosvm. I was curious whether similar bugs have been seen before. (BTW, I found https://lore.kernel.org/netdev/[email protected]/, which looks similar.)

@stefano-garzarella (Member)

Yep, @slp saw something similar, and #384 could also be related.

It looks like we don't have this problem with the in-kernel vhost-vsock device. Can you confirm this?

Anyway, since the device is fully emulated here, IMHO it is expected to happen with any VMM.

@ikicha (Contributor, Author) commented Oct 23, 2023

I ran qemu with -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 (that uses the in-kernel vhost-vsock, right?), and a similar bug happened.

@stefano-garzarella (Member)

> I ran qemu with -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 (that uses the in-kernel vhost-vsock, right?)

right!

> and a similar bug happened.

Mmm, so it could be a problem in the driver then.

Which host and guest kernels are you using?

@ikicha (Contributor, Author) commented Oct 23, 2023

For the host I tested with the latest kernel (6.5?), and for the guest I used 5.4 (based on Ubuntu 20.04).

@ikicha (Contributor, Author) commented Oct 23, 2023

Both guest->host and host->guest have the problem.

@stefano-garzarella (Member)

@ikicha thanks, replicated!

We need to take a closer look at what is going on with the credits and the socket buffer. Could it be that ncat does not set the socket buffer properly and waits for the whole message instead of reading it in pieces?

@ikicha (Contributor, Author) commented Oct 23, 2023

I suspected the credits, because there is no response to the CREDIT_REQUEST. QQ: which components negotiate the credit with each other? vhost-vsock and vhost-device-vsock? Or the VMM and vhost-device-vsock?

@ikicha changed the title from "[vsock] tx/rx hungup after large rx packet" to "[vsock] tx/rx hungup after large packet is delivered" on Oct 24, 2023
@stefano-garzarella (Member)

> I suspected the credits, because there is no response to the CREDIT_REQUEST.

I can't see a CREDIT_REQUEST when using vhost-vsock, but it could be related, I agree.

> QQ: which components negotiate the credit with each other? vhost-vsock and vhost-device-vsock? Or the VMM and vhost-device-vsock?

The VMM is not involved at all in the data path, with both vhost-vsock and vhost-device-vsock. It is involved only in the setup and in handling the event queue used to send events to the guest, mostly for live migration (to reset all connections).

The credit information is exchanged between the driver and the device, usually in every packet. So, if there is traffic, they don't need to explicitly send a CREDIT_REQUEST, since every packet header already carries the credit info.
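For reference, here is a minimal sketch of that header and of the free credit a sender computes (field order per the virtio-vsock spec; the helper function is just illustrative):

import struct

# struct virtio_vsock_hdr (little-endian): every header ends with
# buf_alloc and fwd_cnt, so any packet refreshes the peer's credit view.
VIRTIO_VSOCK_HDR = struct.Struct(
    '<QQIIIHHIII'  # src_cid, dst_cid, src_port, dst_port, len,
                   # type, op, flags, buf_alloc, fwd_cnt
)

def peer_free_credit(peer_buf_alloc, peer_fwd_cnt, tx_cnt):
  # Bytes we can still send without overflowing the peer's buffer:
  # its advertised buffer size minus what we transmitted that the
  # peer has not yet consumed (tx_cnt - peer_fwd_cnt).
  return peer_buf_alloc - (tx_cnt - peer_fwd_cnt)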

@stefano-garzarella
Copy link
Member

@ikicha can you try increasing the vsock buffer size in this way (SO_VM_SOCKETS_BUFFER_MAX_SIZE first, since it caps the value that SO_VM_SOCKETS_BUFFER_SIZE can take):

client.setsockopt(socket.AF_VSOCK, socket.SO_VM_SOCKETS_BUFFER_MAX_SIZE, 2*1024*1024)
client.setsockopt(socket.AF_VSOCK, socket.SO_VM_SOCKETS_BUFFER_SIZE, 2*1024*1024)

@stefano-garzarella (Member)

For vhost-device-vsock we have the --tx-buffer-size parameter, which you can use for the same purpose.
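For example, something like this (--tx-buffer-size is the parameter in question; the other flags are the usual vhost-device-vsock options, shown only to make the command complete):

vhost-device-vsock --guest-cid=3 --socket=/tmp/vhost3.socket --uds-path=/tmp/vm3.vsock --tx-buffer-size=2097152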

@stefano-garzarella (Member)

In the guest, I don't know if ncat sets it properly, so I used this:

import socket

# Guest side: vsock echo server listening on port 1234.
server = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
server.bind((socket.VMADDR_CID_ANY, 1234))
server.listen()

client, _ = server.accept()

# Raise the vsock buffer so a whole 1 MiB message fits in flight
# (MAX_SIZE first, since it caps BUFFER_SIZE).
client.setsockopt(socket.AF_VSOCK, socket.SO_VM_SOCKETS_BUFFER_MAX_SIZE, 2*1024*1024)
client.setsockopt(socket.AF_VSOCK, socket.SO_VM_SOCKETS_BUFFER_SIZE, 2*1024*1024)

i = 0
while True:
  print(i, ": waiting response")
  response = client.recv(1024*1024)
  print(i, ": received ", len(response), "bytes")
  if not response:
    break
  print(i, ": sending ", len(response), "bytes")
  client.sendall(response)
  i += 1

client.close()

@ikicha (Contributor, Author) commented Oct 24, 2023

I tested it with vhost-vsock, and it mitigates the symptom, but it happens again as the packet size increases. The interesting thing is that both the host and the guest get stuck in the client.send (or client.sendall) call.

@stefano-garzarella (Member)

If you don't want to increase the buffer size, what you need to do is have another thread reading.

Otherwise, what happens here is the following:

The host wants to send 1 MB and the guest sends the data back, but when it gets to 256 KB (the default vsock buffer size) everything deadlocks, because the host buffer is full and no one is reading data from the socket.

So the host should have another thread reading while you send. This works fine if the code I sent here runs in the guest (even without increasing the buffer size), while with ncat it doesn't work; I think there is a similar issue in ncat too.

@stefano-garzarella (Member)

> I tested it with vhost-vsock, and it mitigates the symptom, but it happens again as the packet size increases. The interesting thing is that both the host and the guest get stuck in the client.send (or client.sendall) call.

You need to set the buffer size accordingly if you don't read from the socket with another thread.

@stefano-garzarella (Member)

I just tried these two scripts with vhost-vsock, without increasing the vsock buffer, just adding the reader thread on the host side.
They have been running for 2 minutes without stalling.

  • host
import threading
import socket

# Reader thread: keeps draining the socket while the main thread sends,
# so the host-side receive buffer never fills up and stalls the echo.
class ReaderThread(threading.Thread):
  def __init__(self, client):
    threading.Thread.__init__(self)
    self.client = client

  def run(self):
    i = 0
    while True:
      print(i, ": waiting response")
      response = self.client.recv(1024*1024)
      if not response:
        break
      print(i, ": received ", len(response), "bytes")
      i += 1

client = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
client.connect((3, 1234))

# Leftover from the unix-socket variant; with AF_VSOCK the guest echo
# server just sends these bytes back.
message = 'connect 1234\n'
client.sendall(message.encode())
response = client.recv(1024)
print("response:", response)

reader = ReaderThread(client)
reader.start()

i = 0
while True:
  message = ' '*1024*1024*10
  print(i, ": sending ", len(message), "bytes")
  client.sendall(message.encode())
  i += 1

client.close()
reader.join()
  • guest
import socket

# Guest side: plain vsock echo server; no buffer tuning needed once the
# host keeps a reader thread draining the socket.
server = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
server.bind((socket.VMADDR_CID_ANY, 1234))
server.listen()

client, _ = server.accept()

i = 0
while True:
  print(i, ": waiting response")
  response = client.recv(1024*1024)
  print(i, ": received ", len(response), "bytes")
  if not response:
    break
  print(i, ": sending ", len(response), "bytes")
  client.sendall(response)
  i += 1

client.close()

@ikicha (Contributor, Author) commented Oct 24, 2023

Thanks for sharing, I confirm that it works with vhost-vsock. But it still doesn't work with vhost-user-vsock. Do we need some kind of separate threads for tx and rx in vhost-user-vsock (like proposal #323)?

@stefano-garzarella (Member)

> Thanks for sharing, I confirm that it works with vhost-vsock.

Thanks for confirming!

> But it still doesn't work with vhost-user-vsock. Do we need some kind of separate threads for tx and rx in vhost-user-vsock (like proposal #323)?

Yes, that might well be the case. As a workaround, have you tried increasing the --tx-buffer-size parameter of vhost-device-vsock?

@ikicha (Contributor, Author) commented Oct 24, 2023

Increasing the tx buffer size just postpones the error. In the guest, the program hung at client.sendall(response). So I tried adding a timeout before client.sendall(response) to check the hypothesis about the vsock buffer, and it works.
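By a timeout I mean a short sleep before each send, roughly like this (the delay value is arbitrary):

import time

while True:
  response = client.recv(1024*1024)
  if not response:
    break
  time.sleep(0.1)  # arbitrary pause, gives the device time to drain its tx buffer
  client.sendall(response)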

I think vhost-device-vsock needs some kind of flow-control logic, and I was curious whether vhost-vsock has something like that (because this doesn't happen with vhost-vsock).

(BTW, the counterpart of vhost-device-vsock on the vhost-vsock side is drivers/vhost/vsock.c, isn't it?)

@stefano-garzarella (Member)

> Increasing the tx buffer size just postpones the error. In the guest, the program hung at client.sendall(response). So I tried adding a timeout before client.sendall(response) to check the hypothesis about the vsock buffer, and it works.

> I think vhost-device-vsock needs some kind of flow-control logic, and I was curious whether vhost-vsock has something like that (because this doesn't happen with vhost-vsock).

Yeah, we need to look at that part more carefully.
The main difference is that in vhost-device-vsock we also have the unix socket, and I think the problem is there, between the unix socket and the virtio device. If you have time, it would be great if you could solve it ;-)

> (BTW, the counterpart of vhost-device-vsock on the vhost-vsock side is drivers/vhost/vsock.c, isn't it?)

Right.

@stefano-garzarella (Member)

@ikicha ah, maybe increasing the vsock buffer in guest.py from #494 (comment) will help, since I guess the problem in vhost-device-vsock could be similar to the problem in the Python script of #494 (comment).

@stefano-garzarella (Member)

@ikicha FYI, there was a potentially uninitialized field in the virtio-vsock common code (shared between host and guest) affecting buf_alloc and fwd_cnt, so it could also be related to this issue.

It was introduced in Linux v6.3.

@ikicha (Contributor, Author) commented Oct 27, 2023

Thanks for sharing! It might be part of the problem, but I think I found the root cause.

There is a silent error return from epoll_register between reading from the uds and writing the data into the vq in conn.recv_pkt, and it causes a packet drop.
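Roughly this pattern (a hypothetical Python illustration of the control flow, not the actual Rust code; all the helper names here are made up):

def recv_pkt(conn, pkt):
  # Data is read from the unix socket first...
  buf = conn.stream.recv(conn.peer_avail_credit())

  # ...and only then is epoll re-registered. If epoll_register fails,
  # the early return silently drops buf: it was already consumed from
  # the unix socket but never written into the virtqueue.
  if not conn.epoll_register():
    return

  pkt.set_buf(buf)  # only reached on the happy path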

@ikicha (Contributor, Author) commented Oct 27, 2023

I'll send a pull request about that soon.
