
[BUG] Changing VPC NAT GW external network does not remove previous IP resource #4931

Open
cruickshankpg opened this issue Jan 16, 2025 · 2 comments
Labels: bug, ipam, vpc

Comments

@cruickshankpg

Kube-OVN Version

v1.12.22

Kubernetes Version

v1.28.6

Operation-system/Kernel Version

"Ubuntu 22.04.5 LTS" 6.8.0-47-generic

Description

If a VPC NAT Gateway is reconfigured with a replacement externalSubnets entry, the GW pod is correctly recreated on the new external network but the IP resource that was created for the previous external network is not removed.

$ k get vpc-nat-gateway default-ext-gw -o yaml
apiVersion: kubeovn.io/v1
kind: VpcNatGateway
metadata:
  creationTimestamp: "2025-01-03T12:00:31Z"
  generation: 8
  labels:
    ovn.kubernetes.io/subnet: default-sto-1-a
    ovn.kubernetes.io/vpc: default-sto-1
  name: default-ext-gw
  resourceVersion: "426985005"
  uid: 40915e78-8ca2-4e14-ba30-9c47fb1d26f4
spec:
  affinity: {}
  externalSubnets:
  - transit-network
  lanIp: 10.0.0.254
  subnet: default-sto-1-a
  vpc: default-sto-1

Listing the IP resources shows three associated with the gateway, even though the ovn-vpc-external-network one is no longer in use by it.

$ k get ip | grep default-ext-gw
vpc-nat-gw-default-ext-gw-0.kube-system                                                           10.0.0.254               7a:c2:be:9c:da:5d   node-1     default-sto-1-a
vpc-nat-gw-default-ext-gw-0.kube-system.ovn-vpc-external-network.kube-system                      193.180.197.188                              node-1     ovn-vpc-external-network
vpc-nat-gw-default-ext-gw-0.kube-system.transit-network.kube-system.ovn                    172.16.0.5               8e:21:53:de:b2:de   node-1     transit-network

Manually deleting the dangling IP does free up the IP to be reallocated and there are no leaked network interfaces as far as I can tell.
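
For anyone hitting the same thing, here is a minimal sketch of that manual cleanup. `stale_gw_ips` is a hypothetical helper written for this issue, not part of Kube-OVN. It assumes the NAT gateway pod is named `vpc-nat-gw-<gateway>-0` and that the subnet is the last column of `kubectl get ip`, as in the listing above.

```shell
# stale_gw_ips GATEWAY ACTIVE_SUBNET...
# Hypothetical helper: reads `kubectl get ip` output on stdin and prints the
# names of IP resources that belong to the gateway pod but whose subnet
# (last column) is not one of the subnets the gateway currently uses.
stale_gw_ips() {
  gw="$1"
  shift
  awk -v prefix="vpc-nat-gw-$gw-0." -v active="$*" '
    BEGIN {
      # Build a lookup table of the subnets that should be kept.
      n = split(active, a, " ")
      for (i = 1; i <= n; i++) keep[a[i]] = 1
    }
    # Match rows for this gateway pod only; the MAC column may be empty,
    # so the subnet is read as the last field rather than a fixed index.
    index($1, prefix) == 1 && !($NF in keep) { print $1 }
  '
}

# Usage against a live cluster (review the list before deleting anything):
#   kubectl get ip | stale_gw_ips default-ext-gw default-sto-1-a transit-network
#   kubectl delete ip <name printed above>
```

The active subnets passed in are the gateway's own subnet plus its current externalSubnets entries. Anything else attached to the gateway pod is reported as stale.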

Steps To Reproduce

Create VPC NAT gateway on external subnet

cat << EOF | k apply -f -
apiVersion: kubeovn.io/v1
kind: VpcNatGateway
metadata:
  name: default-ext-gw
spec:
  externalSubnets:
  - ovn-vpc-external-network
  lanIp: 10.0.0.254
  subnet: default-sto-1-a
  vpc: default-sto-1
EOF

Change gateway to use different external subnet

cat << EOF | k apply -f -
apiVersion: kubeovn.io/v1
kind: VpcNatGateway
metadata:
  name: default-ext-gw
spec:
  externalSubnets:
  - transit-network
  lanIp: 10.0.0.254
  subnet: default-sto-1-a
  vpc: default-sto-1
EOF
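
After the second apply, one way to confirm which external subnets the gateway spec now references is a jsonpath query. This is only a sketch: `gw_external_subnets` is a hypothetical wrapper name, and it assumes `kubectl` is pointed at the affected cluster.

```shell
# gw_external_subnets GATEWAY
# Hypothetical wrapper: prints the gateway's current spec.externalSubnets
# entries, space-separated, using kubectl's jsonpath output.
gw_external_subnets() {
  kubectl get vpc-nat-gateway "$1" -o jsonpath='{.spec.externalSubnets[*]}'
}

# Usage:
#   gw_external_subnets default-ext-gw
```

Any IP resource for the gateway pod on a subnet outside that list (and outside the gateway's own subnet) is a leftover from the previous configuration.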

Current Behavior

The IP resource for the old external network is not cleaned up when the gateway moves to a different external network.

Expected Behavior

The IP resource for the previous external network should be deleted when the gateway's externalSubnets entry is replaced.

@cruickshankpg cruickshankpg added the bug Something isn't working label Jan 16, 2025

dosubot bot commented Jan 16, 2025

I found a similar closed issue titled "vpc nat gw pod external ip crd duplicate" which was resolved by a pull request that fixed the issue of multus IP not being deleted [1]. However, the specific changes made in the pull request are not detailed in the available information.

In your case, it seems that when the VPC NAT Gateway is reconfigured to use a different external subnet, the IP resource from the previous subnet is not being automatically removed. This behavior might be related to the issue mentioned above, but without the specific details of the fix, it's unclear if the same solution applies.

To address this, you might want to check if there are any updates or patches available for your current version (v1.12.22) that address this specific issue. Additionally, reviewing the cleanup logic in the codebase related to IP resource management when changing external subnets might provide insights into why the old IP is not being deleted.


@zbb88888
Collaborator

Could you post the kube-ovn-controller log lines that cover deleting the old iptables-eip?

The root cause may be that the iptables-eip has to be cleaned up inside the NAT gateway pod, but the pod or the resource cannot be found at that point.
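
A rough sketch of how those log lines could be pulled out. `eip_log_lines` is a hypothetical filter, and the usage line assumes kube-ovn-controller runs as a Deployment in the `kube-system` namespace, which may differ in your install.

```shell
# eip_log_lines GATEWAY
# Hypothetical filter: reads controller log text on stdin and keeps lines that
# mention an EIP (case-insensitive) or the given gateway name.
eip_log_lines() {
  grep -i -F -e 'eip' -e "$1"
}

# Usage (namespace and Deployment name are assumptions; with several
# controller replicas, kubectl picks one pod, so check the leader's logs):
#   kubectl -n kube-system logs deployment/kube-ovn-controller --since=1h \
#     | eip_log_lines default-ext-gw
```

Capturing the logs right after re-applying the gateway with the new external subnet keeps the output focused on the relevant reconcile.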
