Slow iperf3 over Calico VXLAN network #9645

Open
vmax opened this issue Dec 25, 2024 · 8 comments

@vmax

vmax commented Dec 25, 2024

Expected Behavior

Speed test results roughly match between Calico and direct connection

Current Behavior

Speed test over iperf3 is either abysmally slow (in Kbit/s range) or connectivity is non-existent

Possible Solution

To debug, I've tried establishing a direct VXLAN tunnel between two nodes with a different VNI and IP range and the speed matched that of the direct connection:

node1:

ip link add vxlan42 type vxlan id 42 dev eth0 dstport 5000
ip addr add 192.168.100.1/24 dev vxlan42
bridge fdb add 00:00:00:00:00:00 dst NODE2PUBLICIP dev vxlan42
ip link set vxlan42 up

node2:

ip link add vxlan42 type vxlan id 42 dev eno1 dstport 5000
ip addr add 192.168.100.2/24 dev vxlan42
bridge fdb add 00:00:00:00:00:00 dst NODE1PUBLICIP dev vxlan42
ip link set vxlan42 up

test results:

[  5] local 192.168.100.1 port 5201 connected to 192.168.100.2 port 60808
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  79.1 MBytes   663 Mbits/sec                  
[  5]   1.00-2.00   sec  90.6 MBytes   760 Mbits/sec                  
[  5]   2.00-3.00   sec  78.8 MBytes   661 Mbits/sec                  
[  5]   3.00-4.00   sec  71.4 MBytes   599 Mbits/sec                  
[  5]   4.00-5.00   sec  74.8 MBytes   627 Mbits/sec                  
[  5]   5.00-6.00   sec  77.1 MBytes   647 Mbits/sec                  
[  5]   6.00-7.00   sec  58.6 MBytes   492 Mbits/sec                  
[  5]   7.00-8.00   sec  61.2 MBytes   514 Mbits/sec                  
[  5]   8.00-9.00   sec  63.4 MBytes   532 Mbits/sec                  
[  5]   9.00-10.00  sec  51.1 MBytes   429 Mbits/sec                  
[  5]  10.00-10.01  sec   384 KBytes   258 Mbits/sec                  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.01  sec   706 MBytes   592 Mbits/sec                  receiver

Steps to Reproduce (for bugs)

  1. Two nodes: one has vxlan.calico address 10.233.107.0/32, the other has 10.233.123.0/32; both are also reachable over the public Internet
  2. iperf3 -s on first node
  3. iperf3 -c 10.233.107.0 on the second node:
Connecting to host 10.233.107.0, port 5201
[  5] local 10.233.123.0 port 45674 connected to 10.233.107.0 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   105 KBytes   860 Kbits/sec    2   13.7 KBytes       
[  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    6   4.10 KBytes       
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    0   4.10 KBytes       
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    0   4.10 KBytes       
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   4.10 KBytes       
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   4.10 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   105 KBytes  86.1 Kbits/sec    8             sender
[  5]   0.00-10.01  sec  4.10 KBytes  3.35 Kbits/sec                  receiver
  4. iperf3 -c NODE1PUBLICIP from the second node:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  80.2 MBytes   672 Mbits/sec  6987    452 KBytes       
[  5]   1.00-2.00   sec  33.8 MBytes   283 Mbits/sec    0    508 KBytes       
[  5]   2.00-3.00   sec  38.8 MBytes   325 Mbits/sec    0    563 KBytes       
[  5]   3.00-4.00   sec  41.2 MBytes   346 Mbits/sec    0    617 KBytes       
[  5]   4.00-5.00   sec  46.2 MBytes   388 Mbits/sec    0    673 KBytes       
[  5]   5.00-6.00   sec  48.8 MBytes   409 Mbits/sec    0    725 KBytes       
[  5]   6.00-7.00   sec  52.5 MBytes   440 Mbits/sec    0    779 KBytes       
[  5]   7.00-8.00   sec  56.2 MBytes   472 Mbits/sec    0    834 KBytes       
[  5]   8.00-9.00   sec  61.2 MBytes   514 Mbits/sec    0    888 KBytes       
[  5]   9.00-10.00  sec  65.0 MBytes   545 Mbits/sec    0    945 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   524 MBytes   439 Mbits/sec  6987             sender
[  5]   0.00-10.01  sec   522 MBytes   437 Mbits/sec                  receiver
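
To compare runs like the two above without reading every interval, the receiver-side summary line can be pulled out of the text output. This is a minimal sketch, assuming the default human-readable iperf3 output format shown in this issue; `receiver_rate` is a hypothetical helper name, not an iperf3 feature.

```shell
# receiver_rate: read iperf3 client output on stdin and print the bitrate
# from the summary line that ends in "receiver" (for example "437 Mbits/sec").
# Prints nothing if no such line is present.
receiver_rate() {
    awk '
        $NF == "receiver" {
            # The bitrate is the number just before the field ending in "bits/sec".
            for (i = 2; i <= NF; i++) {
                if ($i ~ /bits\/sec$/) {
                    print $(i - 1), $i
                    exit
                }
            }
        }'
}

# Usage (not executed here):
#   iperf3 -c 10.233.107.0 | receiver_rate
#   iperf3 -c NODE1PUBLICIP | receiver_rate
```

Matching on the last field rather than a fixed column keeps the helper working when the output is prefixed with a host name, as in some of the later runs.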

Context

Network connectivity between pods on some nodes in our cluster is disrupted. We couldn't quite isolate which nodes are affected; one hypothesis is that they are on Hetzner and have ARM architecture. Attached is the pcap of the failing test (a lot of TCP retransmissions there).
calico-issue.pcap.gz

Would be happy with any debugging tips!

Your Environment

  • Calico version: v3.27.3
  • Calico dataplane (iptables, windows etc.): iptables, VXLAN overlay
  • Orchestrator version (e.g. kubernetes, mesos, rkt): k8s
  • Operating System and version: Ubuntu 22.04.3 LTS
  • Link to your project (optional):
@caseydavenport
Member

Some things to look into:

@vmax
Author

vmax commented Jan 13, 2025

@caseydavenport thank you for your response! I tried:

  1. Lowering MTU significantly (to 1320) - didn't affect it
  2. ChecksumOffloadBroken=true - I believe it should auto-disable it?

@mazdakn
Member

mazdakn commented Jan 14, 2025

> ChecksumOffloadBroken=true - I believe it should auto-disable it?

@vmax yes, setting the key to true disables it. Are you saying that you disabled it and still got the same results?

@vmax
Author

vmax commented Jan 15, 2025

@mazdakn I've tried creating a brand new VM in Hetzner with the same config (cpu/ram/region/ipv4+ipv6 connectivity), joining it to the same cluster and redoing the speed tests. Unfortunately the result stays the same, even if I disable all of the offload features:

root@am-hetzner001:~# ethtool -k eth0 | grep offload
tcp-segmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off [fixed]
rx-vlan-offload: off [fixed]
tx-vlan-offload: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
macsec-hw-offload: off [fixed]
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

Anything else I could run/check/provide to help debug the issue?

@tomastigera
Contributor

tomastigera commented Jan 15, 2025

ChecksumOffloadBroken=true has a huge negative impact on performance over VXLAN. As @caseydavenport mentioned above, this issue was addressed in 3.29 by #9091 and the fix was backported to 3.28. I recommend you upgrade to at least the latest 3.28. You should set ChecksumOffloadBroken=false if it is set explicitly (it is false by default now), and if offloading does not change automatically, either reboot your nodes or run ethtool -K vxlan.calico tx-checksumming on. Note that the device is the Calico VXLAN device, not eth0.
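
One way to confirm the state of the device after the change is to read it back with `ethtool -k`. This is a minimal sketch, assuming `ethtool -k` prints one `name: state` line per feature; `feature_state` is a hypothetical helper, not part of ethtool or Calico.

```shell
# feature_state NAME: read `ethtool -k <dev>` output on stdin and print the
# state ("on" or "off") of the named feature. Prints nothing if the feature
# is not listed.
feature_state() {
    awk -v name="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)         # sub-features are indented
            split(line, parts, ": ")
            if (parts[1] == name) {
                split(parts[2], state, " ")  # drop a trailing "[fixed]"
                print state[1]
                exit
            }
        }'
}

# Usage (not executed here):
#   ethtool -k vxlan.calico | feature_state tx-checksumming
#   ethtool -k vxlan.calico | feature_state tx-checksum-ip-generic
```

A feature marked `[fixed]` cannot be toggled with `ethtool -K`, so it is worth checking the specific sub-feature as well as the `tx-checksumming` summary line.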

@vmax
Author

vmax commented Jan 16, 2025

@tomastigera I did the following:

  1. Set ChecksumOffloadBroken=false in FelixConfiguration for nodes running iperf3 client / server
  2. On both nodes, I ran ethtool -K vxlan.calico tx-checksumming on, it resulted in:
Actual changes:
tx-checksum-ipv4: off [requested on]
tx-checksum-ipv6: off [requested on]
tx-checksum-fcoe-crc: off [requested on]
tx-checksum-sctp: off [requested on]

Current values are:

# ethtool -k vxlan.calico | grep tx-chec
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
  3. Restarted calico-node
  4. Reran iperf3:
am-playwright-3:  Connecting to host 10.233.100.4, port 5201
am-playwright-3:  [  5] local 10.233.123.244 port 37704 connected to 10.233.100.4 port 5201
am-playwright-3:  [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
am-playwright-3:  [  5]   0.00-1.00   sec  75.1 KBytes   615 Kbits/sec    1   13.7 KBytes       
am-playwright-3:  [  5]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
am-playwright-3:  [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
am-playwright-3:  [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
am-playwright-3:  [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    0   13.7 KBytes       
am-playwright-3:  - - - - - - - - - - - - - - - - - - - - - - - - -
am-playwright-3:  [ ID] Interval           Transfer     Bitrate         Retr
am-playwright-3:  [  5]   0.00-5.00   sec  75.1 KBytes   123 Kbits/sec    1             sender
am-playwright-3:  [  5]   0.00-5.01   sec  0.00 Bytes  0.00 bits/sec                  receiver
am-playwright-3:  
am-playwright-3:  iperf Done.

@tomastigera
Contributor

@vmax hmmm interesting. I know we can surely do gigabits. Is the MTU set correctly as @caseydavenport mentioned above?

@vmax
Author

vmax commented Jan 16, 2025

@tomastigera main interface (eth0/eno1)'s MTU is 1500, vxlan.calico's is 1450 (for VXLAN mode)
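
For reference, the 1450 figure is what the VXLAN header arithmetic gives for a 1500-byte underlay. This is a minimal sketch, assuming no extra encapsulation (VLAN tags, WireGuard, provider tunnels) between the nodes; `vxlan_mtu` and `icmp_payload` are hypothetical helper names.

```shell
# vxlan_mtu UNDERLAY_MTU [IP_VERSION]: print the largest MTU the VXLAN device
# can use without the encapsulated packet exceeding the underlay MTU.
vxlan_mtu() {
    underlay=$1
    ipver=${2:-4}
    case "$ipver" in
        4) overhead=50 ;;   # 20 IPv4 + 8 UDP + 8 VXLAN + 14 inner Ethernet
        6) overhead=70 ;;   # 40 IPv6 + 8 UDP + 8 VXLAN + 14 inner Ethernet
        *) echo "unknown IP version: $ipver" >&2; return 1 ;;
    esac
    echo $((underlay - overhead))
}

# icmp_payload MTU: print the largest IPv4 ping payload that fits in MTU
# (20 bytes IPv4 header + 8 bytes ICMP header).
icmp_payload() {
    echo $(($1 - 28))
}

# Usage (not executed here): send full-size pings with "don't fragment" set
# across the overlay. If these fail while smaller sizes pass, the path MTU
# is lower than the configured one.
#   ping -M do -c 3 -s "$(icmp_payload "$(vxlan_mtu 1500)")" 10.233.107.0
```

If the underlay path between the nodes carries less than 1500 bytes, the same arithmetic applies with the smaller value as the starting point.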
