Talos 1.8.3 advertising virtual MAC addresses #9837

Closed
dbackeus opened this issue Nov 29, 2024 · 40 comments
@dbackeus
Contributor

dbackeus commented Nov 29, 2024

Bug Report

After upgrading some of our nodes from Talos 1.8.2 to 1.8.3 we've received MAC address abuse reports from Hetzner (we're deploying on Hetzner dedicated servers).

The reports state:

We have detected that your server is using different MAC addresses from those allowed by your Robot account.

They then list the unallowed MAC addresses used, a link to trigger a re-check once the issue is resolved, and a link to submit a statement explaining why the MAC abuse occurred.

We've been running this Talos cluster on Hetzner for about two years and have only gotten reports like this in the last week after upgrading to Talos 1.8.3. So this upgrade is the only discrepancy we have to go on right now.

Description

Timeline:

  • Sunday 24th November 20:07 - Upgraded control-plane-1
  • Monday 25th November 02:05 - Received abuse report for control-plane-1 from Hetzner
   Allowed MACs:
       a8:a1:59:xx:xx:xx
   Unallowed MACs:
       00:12:40:11:63:e1
       20:00:40:11:43:db
       20:06:40:11:43:d5
       20:0c:40:11:43:cf
  • Tuesday 26th November 22:44 - Upgraded control-plane-2
  • Wednesday 27th November 02:07 - Received abuse report for control-plane-2 from Hetzner
   Allowed MACs:
       a8:a1:59:xx:xx:xx
   Unallowed MACs:
       00:12:40:11:99:0d
       20:00:40:11:79:07
       20:06:40:11:79:01
       20:0c:40:11:78:fb
  • Thursday 28th November 14:39 - Upgraded mixed-2
  • Thursday 28th November 15:12 - Received abuse report for mixed-2 from Hetzner
   Allowed MACs:
       6c:fe:54:xx:xx:xx
       d0:46:0c:xx:xx:xx
   Unallowed MACs:
       00:18:40:11:14:f0
       20:00:40:11:f4:ef
       20:06:40:11:f4:e9
       20:0c:40:11:f4:e3
       20:12:40:11:f4:dd

Our network config is extremely simple. We run the default Flannel CNI, get a public IP from Hetzner via DHCP, and enable KubeSpan, e.g.:

  network:
    hostname: mixed-2
    interfaces:
      - interface: enp1s0f0
        dhcp: true
    kubespan:
      enabled: true

We've introspected the Talos network via talosctl get links, which does show a lot of veth devices; however, none of them match the MAC addresses reported by Hetzner. We assume the veth devices are meant to stay internal to the cluster network and not be advertised on the physical network.
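
For reference, a quick way to dump link names together with their MAC addresses (a sketch; the hardwareAddr field name is taken from the LinkStatus resource output and may differ between Talos versions):

# list link IDs and their MAC addresses on a node
talosctl -n <node-ip> get links -o yaml | grep -E 'id:|hardwareAddr:'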

While our "mixed" worker runs all kinds of workloads, the two control plane nodes are tainted as control planes and are not running anything out of the ordinary.

When clicking the "re-check" link, the report we get back says that the issue has been resolved. So these issues appear to have been transient. It's unclear whether they can occur again, e.g. when rebooting the nodes. We don't know how to reproduce the problem, or even how to monitor whether unallowed MAC addresses are being advertised.

We don't mind spending time further troubleshooting this if someone can guide us in what to do.

For now we'll send a statement to Hetzner about the little we know, including a link to this issue, and hold off upgrading any other nodes for the time being.

Environment

  • Talos version: 1.8.3
  • Kubernetes version: v1.30.6
  • Platform: Hetzner Dedicated Servers
@smira
Member

smira commented Nov 29, 2024

First, please check MetalLB or anything else you're running on your host network.

Talos only does forced advertisement for VIPs, but they are advertised with the MAC address of the link.

@dbackeus
Contributor Author

dbackeus commented Dec 3, 2024

The pods running with hostNetwork: true are...

Managed by Talos:

kube-system/kube-apiserver-control-plane
kube-system/kube-controller-manager-control-plane
kube-system/kube-scheduler-control-plane
kube-system/kube-flannel
kube-system/kube-proxy

Added by us:

openebs/openebs-ndm

Via: openebs-dynamic-localpv-provisioner

monitoring/monitoring-prometheus-node-exporter
monitoring/metrics-proxy

Via: kube-prometheus-stack

logging/vector

For ingesting system logs into Loki, as suggested by Talos documentation here: https://www.talos.dev/v1.8/talos-guides/configuration/logging/#vector-example

We are not using MetalLB as we are relying on Cloudflare tunnels for HTTP ingress, and NodePort for a pair of Postgres databases.

Note that all of this has been running in our cluster since day one, and appears to run fine on Talos 1.8.2 without triggering any MAC abuse reports.

@dbackeus
Contributor Author

dbackeus commented Dec 3, 2024

As we got another round of abuse reports for these servers, as well as for one additional worker node which had also been upgraded to Talos 1.8.3, we have now downgraded all nodes to 1.8.2 to see whether that prevents further reports.

@m4xmorris

As we got another round of abuse reports for these servers, as well as for one additional worker node which had also been upgraded to Talos 1.8.3, we have now downgraded all nodes to 1.8.2 to see whether that prevents further reports.

Any update on whether this worked for you? Just stood up a cluster on 1.8.3 in Hetzner to run into this issue...

@dbackeus
Contributor Author

dbackeus commented Dec 24, 2024

Yes, downgrading to 1.8.2 stopped triggering the MAC abuse reports!

We have yet to try 1.9.0 to see whether it has somehow rectified the issue. Would be interesting to know.

@dbackeus
Contributor Author

@m4xmorris did you try running 1.9.x yet by any chance?

We'd like to upgrade but are living in fear 😅

@m4xmorris

m4xmorris commented Jan 21, 2025

Afraid the issue still appears to be present in 1.9.x 🙃 Hetzner reports continued to come in after upgrading.

@fmei-dm

fmei-dm commented Feb 6, 2025

Same problem here :(

@dbackeus
Contributor Author

dbackeus commented Feb 6, 2025

@m4xmorris @fmei-dm could you also do an inventory of pods running on hostNetwork as I did here?

Maybe we'll find a pattern 🤞

@fmei-dm

fmei-dm commented Feb 6, 2025

I did some research but found no explicit root cause. What I've found is that between Talos versions 1.8.2 and 1.8.3 a new version of Flannel came into use (ghcr.io/siderolabs/flannel:v0.25.7). See https://github.com/siderolabs/talos/releases/tag/v1.8.3

This version of Flannel uses a new major version of netlink (v1.3.0). See flannel-io/flannel@bfb3669

I've scrolled through the diff between the two netlink versions used in Talos 1.8.2 and Talos 1.8.3 but could not find anything which is clearly causing the problem. See vishvananda/netlink@v1.2.1-beta.2...v1.3.0. Still, I have a strong suspicion it could be related to the new version of Flannel.

Regarding your question: we do not use hostNetwork at all, except for the standard services which are part of the default Talos installation. Nevertheless I will have a look and post the results.

@fmei-dm

fmei-dm commented Feb 6, 2025

Here are all pods on the affected cluster with hostNetwork enabled. Since worker nodes are also affected, only kube-proxy, kube-flannel, or the Datadog agent can be the root cause. Since you are not using the Datadog agent, that leaves kube-proxy or kube-flannel.

Does anyone know if it is possible to manually downgrade Flannel on Talos?

kubectl get pods -A -o json | jq -r '.items[] | select(.spec.hostNetwork == true) | .metadata.namespace + "/" + .metadata.name'

datadog/datadog-agent-h5pqn
datadog/datadog-agent-hx76t
datadog/datadog-agent-k5t6n
datadog/datadog-agent-kggq9
datadog/datadog-agent-m5xws
datadog/datadog-agent-mnnf7
datadog/datadog-agent-pzqhn
datadog/datadog-agent-q9jbk
datadog/datadog-agent-slbkb
datadog/datadog-agent-wc9zq
datadog/datadog-agent-wvx5k
datadog/datadog-agent-xjbbg
kube-system/kube-apiserver-teroknor-cp-1
kube-system/kube-apiserver-teroknor-cp-2
kube-system/kube-apiserver-teroknor-cp-3
kube-system/kube-controller-manager-teroknor-cp-1
kube-system/kube-controller-manager-teroknor-cp-2
kube-system/kube-controller-manager-teroknor-cp-3
kube-system/kube-flannel-9jlv7
kube-system/kube-flannel-9tvqc
kube-system/kube-flannel-b6nzn
kube-system/kube-flannel-f6pjr
kube-system/kube-flannel-gvn6c
kube-system/kube-flannel-lzc88
kube-system/kube-flannel-n5xp2
kube-system/kube-flannel-p2q4v
kube-system/kube-flannel-pwgkc
kube-system/kube-flannel-qpv69
kube-system/kube-flannel-qsxcz
kube-system/kube-flannel-v76x4
kube-system/kube-proxy-2cw8t
kube-system/kube-proxy-2xwzg
kube-system/kube-proxy-59ktk
kube-system/kube-proxy-6bpv9
kube-system/kube-proxy-75tsz
kube-system/kube-proxy-8d9ss
kube-system/kube-proxy-gktvv
kube-system/kube-proxy-gw7rn
kube-system/kube-proxy-ksq7t
kube-system/kube-proxy-pb4r4
kube-system/kube-proxy-wpxwv
kube-system/kube-proxy-xn64n
kube-system/kube-scheduler-teroknor-cp-1
kube-system/kube-scheduler-teroknor-cp-2
kube-system/kube-scheduler-teroknor-cp-3

@smira
Member

smira commented Feb 6, 2025

Does anyone know if it is possible to manually downgrade Flannel on Talos?

You can by disabling Talos CNI and deploying your own Flannel the way you'd like. As a quick hack you can change the version in the DaemonSet.
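
For example, something along these lines (a hypothetical sketch; the DaemonSet and container names are assumptions, check them with kubectl -n kube-system get ds first):

# pin the flannel DaemonSet to an older image tag
kubectl -n kube-system set image daemonset/kube-flannel kube-flannel=ghcr.io/siderolabs/flannel:<older-tag>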

@m4xmorris

m4xmorris commented Feb 6, 2025

I'm no longer running a cluster in Hetzner so I'm not able to help too much, but I was having this issue when using Cilium (with kube-proxy replacement) instead of the default Flannel. That and kube-prometheus-stack are the only things with hostNetwork I deployed outside of any Talos defaults.

@fmei-dm

fmei-dm commented Feb 6, 2025

Does anyone know if it is possible to manually downgrade Flannel on Talos?

You can by disabling Talos CNI and deploying your own Flannel the way you'd like. As a quick hack you can change the version in the DaemonSet.

Many thanks, I think I can do that. But I don't know how to reproduce the error, because I only got abuse reports for 4 out of 9 K8s nodes. So even if I get no abuse notification, I can't be sure whether it worked. What I can try is to run tcpdump on every K8s worker and listen for all Ethernet frames which do not originate from the device's MAC address. Maybe I'll get lucky and capture some of the bad packets. Does anybody have a better idea?

@smira
Member

smira commented Feb 6, 2025

The only way Talos Linux advertises IPs from userspace is a Layer 2 VIP (not even a Hetzner VIP), but it would advertise the VIP address, not a random IP.

My only guess is that these advertisements somehow slip out from the pod networking pods, but that's a wild guess, and I'm not sure how they would reach the outbound NIC. But even if they do, it's totally unclear what kind of problem that is for Hetzner; a switch should be configured to lock by MAC, so who cares what gets advertised.

@fmei-dm

fmei-dm commented Feb 6, 2025

I used the following command to start tcpdump in a screen session on every worker node:

apt update && apt install -y iproute2 tcpdump screen && screen -dmS tcpdump_session bash -c 'tcpdump -i $(ip -o link show | awk -F ": " "/enp/{print \$2; exit}") ether src not $(ip link show | awk "/enp/{getline; print \$2; exit}") and outbound'
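
For readability, the same capture spelled out (a sketch; the interface name and MAC are placeholders to substitute per host, and like the one-liner above it assumes a Debian-based privileged container since Talos itself has no apt):

IFACE=enp9s0                  # physical uplink interface
HOST_MAC=a8:a1:59:xx:xx:xx    # MAC address of that interface
# capture only frames leaving the host whose source MAC is not the NIC's own
tcpdump -i "$IFACE" "ether src not $HOST_MAC and outbound"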

@fmei-dm

fmei-dm commented Feb 7, 2025

I found the following packets being sent out on the Ethernet interfaces.

Findings:

  • Ethertype unknown. I did some Google research and they are in fact unknown, so I suspect this is garbage which is not even a correctly formatted Ethernet frame.
  • In the past I had already researched the vendor of the MAC addresses used and found nothing. That makes sense now, because these seem to be garbage frames.
  • Based on the content, I did not find any indication of who sent these frames.

tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown (0x58c6), length 68:
        0x0000:  8dda 74ca f1ae ca6c ca6c 0098 969c 0400  ..t....l.l......
        0x0010:  0000 4730 3f18 6800 0000 0000 0000 9971  ..G0?.h........q
        0x0020:  c4c9 9055 a157 0a70 9ead bf83 38ca ab38  ...U.W.p....8..8
        0x0030:  8add ab96 e052                           .....R
22:07:02.329672 20:06:40:11:18:f5 > 45:00:00:44:f4:94, ethertype Unknown (0x58c6), length 68:
        0x0000:  8dda 74ca f1ae 9de9 542b e0dd 00bf 1200  ..t.....T+......
        0x0010:  0a14 f676 c72b 1d88 275b 5edb dcfe 15a2  ...v.+..'[^.....
        0x0020:  f61b 90e2 cff1 857a e56c d10c 253a 8057  .......z.l..%:.W
        0x0030:  b27a e962 e9fd                           .z.b..
22:07:02.329674 20:0c:40:11:18:ef > 45:00:00:44:f4:94, ethertype Unknown (0x58c6), length 68:
        0x0000:  8dda 74ca f1ae a317 8a61 a132 26dc 9d24  ..t......a.2&..$
        0x0010:  709a 2317 b840 723e c8c5 e82f 45d9 321c  p.#..@r>.../E.2.
        0x0020:  cfa6 d389 68d5 492d 24fa 0ac9 e82d 0d32  ....h.I-$....-.2
        0x0030:  efe0 d25e 9baf                           ...^..
22:07:02.329677 00:12:40:11:39:11 > 45:00:00:1c:f4:94, ethertype Unknown (0x58c6), length 28:
        0x0000:  8dda 74ca f1ae 8273 ee74 60e0 7aa0       ..t....s.t`.z.
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp6s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
01:35:17.288872 20:00:40:11:53:f1 > 45:00:00:44:95:ce, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae ca6c ca6c 0098 d2cf 0400  h9t....l.l......
        0x0010:  0000 5334 db07 0100 0000 0000 0000 3a85  ..S4..........:.
        0x0020:  56cd d55e 1dbd 1dce ea77 1e52 6cae 25c5  V..^.....w.Rl.%.
        0x0030:  d609 40e4 65e5                           [email protected].
01:35:17.288877 20:06:40:11:53:eb > 45:00:00:44:95:ce, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae 066f 4ee2 bf50 b557 8327  h9t....oN..P.W.'
        0x0010:  eded 039c b1b9 4315 ee67 f1bd e7c9 87fa  ......C..g......
        0x0020:  91c9 fd2d d97f 05be 9719 bec1 b2bd 1100  ...-............
        0x0030:  562a d802 eca1                           V*....
01:35:17.288878 20:0c:40:11:53:e5 > 45:00:00:44:95:ce, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae a2a4 752d df88 a4a6 fb53  h9t.....u-.....S
        0x0010:  5464 6049 1e6f e4fd 3cd3 9f57 3776 c514  Td`I.o..<..W7v..
        0x0020:  15d2 2ed8 9604 4464 eca7 8b65 3b13 dee2  ......Dd...e;...
        0x0030:  fab7 223a 5bae                           ..":[.
01:35:17.288880 00:12:40:11:74:07 > 45:00:00:1c:95:ce, ethertype Unknown (0xa237), length 28:
        0x0000:  6839 74ca f1ae 2a49 565c dcd5 6346       h9t...*IV\..cF
03:27:02.225153 20:00:40:11:df:d2 > 45:00:00:44:09:ed, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae ca6c ca6c 0098 5c41 0400  h9t....l.l..\A..
        0x0010:  0000 57ae 3db0 0900 0000 0000 0000 4c99  ..W.=.........L.
        0x0020:  9460 9fc0 203b 6ce1 67f5 4ad9 b8d5 63ea  .`...;l.g.J...c.
        0x0030:  81da 6cd2 c495                           ..l...
03:27:02.225165 20:06:40:11:df:cc > 45:00:00:44:09:ed, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae 06e4 59d4 9be0 8522 26d6  h9t.....Y...."&.
        0x0010:  93fe 0097 14d5 1ed3 502a 324b 8667 32e0  ........P*2K.g2.
        0x0020:  dfca 0deb 97fc a07f 3539 2ba7 69dd b628  ........59+.i..(
        0x0030:  0132 fb81 5838                           .2..X8
03:27:02.225166 20:0c:40:11:df:c6 > 45:00:00:44:09:ed, ethertype Unknown (0xa237), length 68:
        0x0000:  6839 74ca f1ae b819 2dfb c21b 5be8 65f8  h9t.....-...[.e.
        0x0010:  6755 af64 1c90 ec02 838d b77c c741 6769  gU.d.......|.Agi
        0x0020:  04ab a6d5 a70a 59f1 9909 123a f006 d58f  ......Y....:....
        0x0030:  4b09 cf6e 028b                           K..n..
03:27:02.225168 00:12:40:11:ff:e8 > 45:00:00:1c:09:ed, ethertype Unknown (0xa237), length 28:
        0x0000:  6839 74ca f1ae b8b2 74e6 b0b2 b566       h9t.....t....f
04:58:17.243971 00:00:40:11:9e:88 > 45:00:00:ac:6a:cf, ethertype Unknown (0xa237), length 172:
        0x0000:  6839 74ca f1ae ca6c ca6c 0098 7193 0400  h9t....l.l..q...
        0x0010:  0000 e73c 3fdf 1200 0000 0000 0000 ef27  ...<?..........'
        0x0020:  a6f2 e511 6383 3310 dbb8 0924 0dc7 e747  ....c.3....$...G
        0x0030:  e627 7d1c 8dcf 3521 e231 f369 a77f 3b5e  .'}...5!.1.i..;^
        0x0040:  bc0b 9c82 d66f 220f 4f67 bf82 f488 8d8a  .....o".Og......
        0x0050:  4eb9 b978 c952 4a72 7595 a343 6866 936c  N..x.RJru..Chf.l
        0x0060:  620b ab34 ca5f ebf0 4231 d7b2 e7b9 95fa  b..4._..B1......
        0x0070:  fa7c 1965 8649 a074 27ae 10cc acdc 656c  .|.e.I.t'.....el
        0x0080:  e2dd 89bc da54 c0b2 2fae 7e8b a3b5 9a5b  .....T../.~....[
        0x0090:  8738 9f61 7332 efef 5d9f 5e29 22f2       .8.as2..].^)".
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp6s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
04:10:02.255595 20:00:40:11:a2:95 > 45:00:00:44:95:08, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a53 74ca f1ae ca6c ca6c 0098 907f 0400  .St....l.l......
        0x0010:  0000 951b 8884 1600 0000 0000 0000 cd84  ................
        0x0020:  5542 fd27 66a4 e5ae 54ec 2e40 0abe 45bc  UB.'[email protected].
        0x0030:  28d8 739f 8aa4                           (.s...
04:10:02.255609 20:06:40:11:a2:8f > 45:00:00:44:95:08, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a53 74ca f1ae a2f0 2dea 7a64 d313 0ba2  .St.....-.zd....
        0x0010:  7955 6e15 5cf3 8ccb 2e10 7deb a434 cfb0  yUn.\.....}..4..
        0x0020:  dc2e c994 7c03 a36f 0463 bf13 fcf4 2bb0  ....|..o.c....+.
        0x0030:  e767 c1a6 b6dd                           .g....
04:10:02.255611 20:0c:40:11:a2:89 > 45:00:00:44:95:08, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a53 74ca f1ae 5142 c145 68c1 59f8 09ec  .St...QB.Eh.Y...
        0x0010:  c09c c55e a1f7 3e72 353d 674a 6478 43b5  ...^..>r5=gJdxC.
        0x0020:  edb8 0b27 2e7f aee2 ee8c cbbe 64e5 371b  ...'........d.7.
        0x0030:  b0af ea6f be79                           ...o.y
04:10:02.255613 00:12:40:11:c2:ab > 45:00:00:1c:95:08, ethertype Unknown (0xb23f), length 28:
        0x0000:  0a53 74ca f1ae 2c47 ca08 bd47 276d       .St...,G...G'm
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp6s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
23:18:32.241039 20:00:40:11:d1:59 > 45:00:00:44:66:4c, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae ca6c ca6c 0098 fdb6 0400  .Kt....l.l......
        0x0010:  0000 0be7 8b78 1501 0000 0000 0000 65ef  .....x........e.
        0x0020:  abca 8431 0156 f144 6e37 b976 e24c 998c  ...1.V.Dn7.v.L..
        0x0030:  1cab a9a3 48c8                           ....H.
23:18:32.241052 20:06:40:11:d1:53 > 45:00:00:44:66:4c, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae e32e d65f aa11 87c4 aa0f  .Kt......_......
        0x0010:  433a 33ae 0cff 236c 85b9 b252 6390 a2d4  C:3...#l...Rc...
        0x0020:  a153 1579 0e0e 5c54 f60e 35a3 d687 8c6d  .S.y..\T..5....m
        0x0030:  4dbe 2ee3 17bf                           M.....
23:18:32.241054 20:0c:40:11:d1:4d > 45:00:00:44:66:4c, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae 86ab 3ba4 36b8 047d fa43  .Kt.....;.6..}.C
        0x0010:  b184 11fa 1198 29ff 17c7 061f 0803 8770  ......)........p
        0x0020:  90ce 990c f4c0 ec2a e6c3 076e abfd 0ec4  .......*...n....
        0x0030:  02e6 dc6f 0653                           ...o.S
23:18:32.241056 00:12:40:11:f1:6f > 45:00:00:1c:66:4c, ethertype Unknown (0xb23f), length 28:
        0x0000:  0a4b 74ca f1ae 2ee1 0ca0 40cb e44a       [email protected]
06:24:47.275978 20:00:40:11:9f:00 > 45:00:00:44:98:a5, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae ca6c ca6c 0098 dfe8 0400  .Kt....l.l......
        0x0010:  0000 eeaf b66e 2a00 0000 0000 0000 f877  .....n*........w
        0x0020:  4b24 45c6 dc5d e835 a93a d342 acdf 1248  K$E..].5.:.B...H
        0x0030:  af7d e105 2c28                           .}..,(
06:24:47.275992 20:06:40:11:9e:fa > 45:00:00:44:98:a5, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae 6fec ec80 ef42 3f0d 6b12  .Kt...o....B?.k.
        0x0010:  3648 c87b a84a 048c cc26 4b55 b49a c66f  6H.{.J...&KU...o
        0x0020:  a7cb 9641 61c3 70b4 27fe e2d0 5f21 897f  ...Aa.p.'..._!..
        0x0030:  318b 7f60 ef55                           1..`.U
06:24:47.275994 20:0c:40:11:9e:f4 > 45:00:00:44:98:a5, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae 22e6 97e2 5fec 62f0 e0a9  .Kt..."..._.b...
        0x0010:  8e6f b383 0131 4133 892f f916 f924 baa3  .o...1A3./...$..
        0x0020:  edc1 645e a3b5 1e21 83be 2c36 41c9 f0f8  ..d^...!..,6A...
        0x0030:  04b8 25f4 c37d                           ..%..}
06:24:47.275996 00:12:40:11:bf:16 > 45:00:00:1c:98:a5, ethertype Unknown (0xb23f), length 28:
        0x0000:  0a4b 74ca f1ae 30d3 dbd2 d61b 951a       .Kt...0.......
06:35:02.286007 20:00:40:11:60:0f > 45:00:00:44:d7:96, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae ca6c ca6c 0098 e35b 0400  .Kt....l.l...[..
        0x0010:  0000 74e3 b309 6100 0000 0000 0000 40b6  ..t...a.......@.
        0x0020:  1d9a 71f6 7989 5b4b 93ad 45d7 72d4 35f2  ..q.y.[K..E.r.5.
        0x0030:  52ae 65d8 6f07                           R.e.o.
06:35:02.286021 20:06:40:11:60:09 > 45:00:00:44:d7:96, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae b2db ac2a 4cf1 7140 8ac5  .Kt......*L.q@..
        0x0010:  c1e8 29d4 ae98 368a c4cf 55f7 6902 a70c  ..)...6...U.i...
        0x0020:  c4da a803 a057 bc62 9796 ecea 2d3b 83c8  .....W.b....-;..
        0x0030:  07e2 7a57 c63e                           ..zW.>
06:35:02.286022 20:0c:40:11:60:03 > 45:00:00:44:d7:96, ethertype Unknown (0xb23f), length 68:
        0x0000:  0a4b 74ca f1ae 3cdd 2bb4 935a 67b4 9cb5  .Kt...<.+..Zg...
        0x0010:  26d0 e990 dc62 d655 20e2 4228 d74d 53fd  &....b.U..B(.MS.
        0x0020:  916b f123 b57d d528 d0fa 637d 8447 49b1  .k.#.}.(..c}.GI.
        0x0030:  6355 5d5e c652                           cU]^.R
06:35:02.286024 00:12:40:11:80:25 > 45:00:00:1c:d7:96, ethertype Unknown (0xb23f), length 28:
        0x0000:  0a4b 74ca f1ae 6d74 b876 2e5c 5f9e       .Kt...mt.v.\_.

@fmei-dm

fmei-dm commented Feb 7, 2025

I checked whether any process was created on one of the affected Kubernetes nodes around the timestamps of the packets. Not even close. So the packets must have been emitted by a process that was already running. It seems this has nothing to do with reconfiguration of Flannel networking when creating/evicting pods.

@fmei-dm

fmei-dm commented Feb 7, 2025

I was completely wrong: Flannel cannot be the problem. It was not upgraded between 1.8.2 and 1.8.3; somehow I made a mistake when comparing the versions.

So now I want to check whether the Linux kernel could be the problem. It was upgraded from 6.6.58 to 6.6.60. I've checked the changelogs; there are many changes which could have such an effect, but far too many for me to check them all.

@smira is it possible to downgrade the Linux kernel to 6.6.58 but keep the rest of Talos as is?

@smira
Member

smira commented Feb 7, 2025

@smira is it possible to downgrade the Linux kernel to 6.6.58 but keep the rest of Talos as is?

You can if you do a custom build, downgrading the pkgs version in Talos.

@fmei-dm

fmei-dm commented Feb 7, 2025

Thanks! I now have Talos v1.8.4 running with kernel 6.6.58. Let's see what happens.

A little tricky if you don't know it: you have to find the tag suffix for the pkgs images in the commit history of Talos (branch 1.8). See sample commit: 9f62fe9

Also, v1.8.3 added two kernel modules for block device caching, so I had to delete those lines in hack/modules-amd64.txt (see the diff below).

make installer INSTALLER_ARCH=amd64 PLATFORM=linux/amd64 PUSH=true USERNAME=fmei-dm PKG_KERNEL=ghcr.io/siderolabs/kernel:v1.8.0-23-g9aac1a8 PKG_KMOD=ghcr.io/siderolabs/kmod:v1.8.0-23-g9aac1a8 TAG_SUFFIX=oldkernel

diff --git a/hack/modules-amd64.txt b/hack/modules-amd64.txt
index 486669371..07fd7e060 100644
--- a/hack/modules-amd64.txt
+++ b/hack/modules-amd64.txt
@@ -66,8 +66,6 @@ kernel/drivers/i2c/i2c-smbus.ko
 kernel/drivers/infiniband/hw/mlx4/mlx4_ib.ko
 kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko
 kernel/drivers/infiniband/sw/rxe/rdma_rxe.ko
-kernel/drivers/md/dm-cache.ko
-kernel/drivers/md/dm-cache-smq.ko
 kernel/drivers/md/dm-bio-prison.ko
 kernel/drivers/md/dm-multipath.ko
 kernel/drivers/md/dm-raid.ko

@smira
Member

smira commented Feb 7, 2025

@fmei-dm also Talos 1.9.x is based on Linux 6.12.x, so the bug might not be there

@fmei-dm

fmei-dm commented Feb 7, 2025

@fmei-dm also Talos 1.9.x is based on Linux 6.12.x, so the bug might not be there

@m4xmorris already tried that. Same problem:

Afraid the issue still appears to be present in 1.9.x 🙃 Hetzner reports continued to come in after upgrading.

@smira
Member

smira commented Feb 7, 2025

That's weird... The strange thing is that these packets appear to be simply corrupted somehow (?).

@fmei-dm

fmei-dm commented Feb 7, 2025

Absolutely. They seem like garbage: no known MAC address range (for both source and destination MAC), no known ethertype. I've seen an issue where a VLAN tag was placed into the ethertype field by a bug, but that was on Ubuntu, and we don't use VLAN tagging on the outgoing interface of these servers anyway. So either way it must be some kind of bug. Very weird.

@fmei-dm

fmei-dm commented Feb 7, 2025

Had another idea: what if this problem is a hardware-specific issue with the new kernel version? So I checked the network driver in use via dmesg. It's using the pretty common r8169 module. Exactly this module was patched in kernel version 6.6.59, and there was a discussion about issues under heavy load in https://bugzilla.kernel.org/show_bug.cgi?id=219388.

https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.6.59:

commit 7d6d46b429804b1a182106e27e2f8c0e84689e1a
Author: Heiner Kallweit <[email protected]>
Date:   Fri Oct 18 11:08:16 2024 +0200

    r8169: avoid unsolicited interrupts
    
    [ Upstream commit 10ce0db787004875f4dba068ea952207d1d8abeb ]
    
    It was reported that after resume from suspend a PCI error is logged
    and connectivity is broken. Error message is:
    PCI error (cmd = 0x0407, status_errs = 0x0000)
    The message seems to be a red herring as none of the error bits is set,
    and the PCI command register value also is normal. Exception handling
    for a PCI error includes a chip reset what apparently brakes connectivity
    here. The interrupt status bit triggering the PCI error handling isn't
    actually used on PCIe chip versions, so it's not clear why this bit is
    set by the chip. Fix this by ignoring this bit on PCIe chip versions.
    
    Fixes: 0e4851502f84 ("r8169: merge with version 8.001.00 of Realtek's r8168 driver")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219388
    Tested-by: Atlas Yu <[email protected]>
    Signed-off-by: Heiner Kallweit <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

@smira
Member

smira commented Feb 7, 2025

Yes, this is a great find! In general, Realtek drivers are problematic.

Talos 1.10 should come with low-level Ethernet configuration, which might allow you to disable e.g. checksum offloading, if that helps to work around the issue (?).
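
For reference, the ethtool equivalent of such a knob, runnable today from a privileged debug container (a sketch; the interface name is a placeholder, and this only mirrors what a "disable checksum offloading" setting would toggle):

# disable TX checksum offloading on the uplink interface
ethtool -K enp1s0 tx off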

@fmei-dm

fmei-dm commented Feb 7, 2025

Can't be r8169; same problem on another cluster with e1000e interfaces. Meh.

@fmei-dm

fmei-dm commented Feb 8, 2025

The problem seems to be gone when using the 6.6.58 kernel, so it must be a change in 6.6.59 or 6.6.60.

What now? @smira: do you have any idea how to identify the relevant patch? I've searched the kernel changelogs for changes that belong to the networking stack but are not tied to specific network hardware. I identified one which might be relevant, but I have no idea if I am right. Do you guys (from Talos) have contact with kernel developers for the networking stack?
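
One way to narrow the search (a sketch, assuming a local clone of the stable kernel tree; the path filter is just a guess at where a driver-independent change would live):

# list stable commits between the two releases that touch the core networking code
git log --oneline v6.6.58..v6.6.60 -- net/core/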

The patch which might be the problem is:

commit a7bdb199784fce9584144b51246abd2e01ddc206
Author: Eric Dumazet <[email protected]>
Date:   Tue Oct 15 19:41:18 2024 +0000

    net: fix races in netdev_tx_sent_queue()/dev_watchdog()
    
    [ Upstream commit 95ecba62e2fd201bcdcca636f5d774f1cd4f1458 ]
    
    Some workloads hit the infamous dev_watchdog() message:
    
    "NETDEV WATCHDOG: eth0 (xxxx): transmit queue XX timed out"
    
    It seems possible to hit this even for perfectly normal
    BQL enabled drivers:
    
    1) Assume a TX queue was idle for more than dev->watchdog_timeo
       (5 seconds unless changed by the driver)
    
    2) Assume a big packet is sent, exceeding current BQL limit.
    
    3) Driver ndo_start_xmit() puts the packet in TX ring,
       and netdev_tx_sent_queue() is called.
    
    4) QUEUE_STATE_STACK_XOFF could be set from netdev_tx_sent_queue()
       before txq->trans_start has been written.
    
    5) txq->trans_start is written later, from netdev_start_xmit()
    
        if (rc == NETDEV_TX_OK)
              txq_trans_update(txq)
    
    dev_watchdog() running on another cpu could read the old
    txq->trans_start, and then see QUEUE_STATE_STACK_XOFF, because 5)
    did not happen yet.
    
    To solve the issue, write txq->trans_start right before one XOFF bit
    is set :
    
    - _QUEUE_STATE_DRV_XOFF from netif_tx_stop_queue()
    - __QUEUE_STATE_STACK_XOFF from netdev_tx_sent_queue()
    
    From dev_watchdog(), we have to read txq->state before txq->trans_start.
    
    Add memory barriers to enforce correct ordering.
    
    In the future, we could avoid writing over txq->trans_start for normal
    operations, and rename this field to txq->xoff_start_time.
    
    Fixes: bec251bc8b6a ("net: no longer stop all TX queues in dev_watchdog()")
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Willem de Bruijn <[email protected]>
    Reviewed-by: Toke Høiland-Jørgensen <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

@fmei-dm

fmei-dm commented Feb 9, 2025

Created Kernel Bug Report: https://bugzilla.kernel.org/show_bug.cgi?id=219766

@lucsoft

lucsoft commented Feb 17, 2025

@fmei-dm looks like they have already identified the issue: https://lore.kernel.org/netdev/[email protected]/T/
They are only looking into a good solution for it now.

I also had a Hetzner claim today; I gave them all the details and they accepted it.
As a precaution I also downgraded to 1.8.2, as you said above.

Hope the fix lands quickly so I can upgrade to 1.10 or something. Or is there an easy way to be on a modern version and keep the kernel version, @smira?

@fmei-dm

fmei-dm commented Feb 17, 2025

Pretty straightforward, if you know how. The only tricky part was finding out which kernel package to use. After building, the image is pushed to the Docker registry of your choice. You can then reference the resulting installer image in talosctl upgrade. I made the Docker image public; I don't know whether Talos can pull from a password-protected registry.

GITHUB_USER=xxxx
git clone https://github.com/siderolabs/talos.git
cd talos
git checkout v1.8.4
docker login ghcr.io -u ${GITHUB_USER}
make installer INSTALLER_ARCH=amd64 PLATFORM=linux/amd64 PUSH=true USERNAME=${GITHUB_USER} PKG_KERNEL=ghcr.io/siderolabs/kernel:v1.8.0-23-g9aac1a8 PKG_KMOD=ghcr.io/siderolabs/kmod:v1.8.0-23-g9aac1a8 TAG_SUFFIX=oldkernel
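
To then roll a node onto the custom build, something like the following (a sketch; the exact image reference depends on what the build pushed, so check your registry first):

# upgrade a single node to the custom installer image
talosctl upgrade --nodes <node-ip> --image ghcr.io/${GITHUB_USER}/installer:v1.8.4-oldkernel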

@smira
Member

smira commented Feb 17, 2025

Like I said above, Talos 1.9 is using Linux 6.12, and it's not clear to me whether it is affected or not.

@fmei-dm

fmei-dm commented Feb 17, 2025

Like I said above, Talos 1.9 is using Linux 6.12, and it's not clear to me whether it is affected or not.

@smira m4xmorris already tested it. Same problem. See linked comment.

#9837 (comment)

@fmei-dm

fmei-dm commented Feb 25, 2025

@smira it seems the bugfix will make it into kernel 6.14. Do you know which Talos version will use this kernel? Is it possible to backport this little patch into the kernel 6.12 of the official Talos release?

@smira
Member

smira commented Feb 25, 2025

@smira it seems the bugfix will make it into kernel 6.14. Do you know which Talos version will use this kernel? Is it possible to backport this little patch into the kernel 6.12 of the official Talos release?

yes, as long as you have a commit ref to Linux main and a thread from lkml, we can backport it of course

@fmei-dm

fmei-dm commented Feb 25, 2025

Awesome! Below you can find the requested links:

@fmei-dm

fmei-dm commented Feb 25, 2025

PS: is it safe to upgrade from 1.8.2 to the latest 1.9 version? I ask because I cannot upgrade to the latest 1.8 patch release because of this problem.

@smira
Member

smira commented Feb 25, 2025

PS: is it safe to upgrade from 1.8.2 to the latest 1.9 version? I ask because I cannot upgrade to the latest 1.8 patch release because of this problem.

yes

smira added a commit to smira/pkgs that referenced this issue Feb 25, 2025
See siderolabs/talos#9837

This causes invalid Ethernet packets to be sent out, which might
trigger unrelated issues in some environments.

See:

* git.kernel.org/netdev/net/c/0e4427f8f587
* lore.kernel.org/netdev/[email protected]/T

Signed-off-by: Andrey Smirnov <[email protected]>
@smira
Member

smira commented Feb 25, 2025

Ok, the fix will be backported to the next 1.9.x release.

Thanks for finding the patch!

smira closed this as completed Feb 25, 2025
smira self-assigned this Feb 25, 2025