Prefixes stop working locally after a time #312

Open
PlagueCZ opened this issue Jun 27, 2023 · 1 comment

@PlagueCZ
Contributor

In a setup where neighboring VMs have prefixes set up and one VM can ping the other using an IP from the prefix range, this connectivity can stop working over time.

This happens when metalnet (not dp-service) gets restarted.

We narrowed it down to the gRPC call init(), which resets routes (and VNIs); this is a relatively new feature. The reset does not seem complete: the gRPC call listPrefixes() is not affected by it, so metalnet's reconciliation is not triggered and no prefixes are created.

I have created a simple demo of this (the length of the prefix list is not affected by a call to init()) in test_initbug.py in fix/init_prefix_reset.
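
To make the mismatch concrete, here is a minimal toy model of it. This is only a sketch and not the dp-service API or the contents of test_initbug.py: `FakeDataplane`, `reconcile`, and the idea that a prefix only works while a matching route entry exists are assumptions made for illustration.

```python
class FakeDataplane:
    """Toy in-memory stand-in for dp-service (hypothetical, not the real gRPC API)."""

    def __init__(self):
        self.routes = set()    # forwarding state; cleared by init()
        self.prefixes = set()  # what list_prefixes() reports

    def init(self):
        # Models the behaviour described in this issue: routes are reset,
        # but the prefix list is left untouched.
        self.routes.clear()

    def create_prefix(self, prefix):
        # Assumption: creating a prefix also installs its forwarding state.
        self.prefixes.add(prefix)
        self.routes.add(prefix)

    def list_prefixes(self):
        return sorted(self.prefixes)


def reconcile(dp, desired):
    """Create only the prefixes that the dataplane does not already list."""
    created = []
    for prefix in desired:
        if prefix not in dp.list_prefixes():
            dp.create_prefix(prefix)
            created.append(prefix)
    return created


dp = FakeDataplane()
desired = ["10.0.0.0/24"]
first = reconcile(dp, desired)   # first start: the prefix gets created
dp.init()                        # a restart triggers init() again
second = reconcile(dp, desired)  # prefix is still listed, so nothing is recreated
```

In this model `second` is empty while `dp.routes` is empty as well: the prefix is still listed, but the state needed to forward to it is gone.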

The test case that matches our setup is test_vf_to_vf.py::test_vf_to_vf_tcp_pfx, where a packet that should be sent to a VM gets sent to the router instead. Commenting out lines 36-42 (init and route setup) makes the communication work as it should.

Florin "fixed" this in metalnet by first calling initialized() and calling init() only when that fails (followed by a subsequent initialized() check), so init() is only ever called once per dp-service lifetime; see this commit.
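
The workaround pattern can be sketched roughly as follows. This is not the metalnet code (which is not shown in this issue): `ensure_initialized`, `FakeClient`, and `NotInitializedError` are hypothetical names, and the sketch assumes initialized() raises when the service has not been initialized yet.

```python
class NotInitializedError(Exception):
    """Hypothetical error raised while the service is not initialized."""


class FakeClient:
    """Toy stand-in for a dp-service client (hypothetical, for illustration only)."""

    def __init__(self):
        self._ready = False
        self.init_calls = 0

    def initialized(self):
        if not self._ready:
            raise NotInitializedError("dataplane not initialized")
        return True

    def init(self):
        self.init_calls += 1
        self._ready = True


def ensure_initialized(client):
    """Call init() only if initialized() fails, then check initialized() again."""
    try:
        return client.initialized()
    except NotInitializedError:
        client.init()
        return client.initialized()


client = FakeClient()
ensure_initialized(client)  # first start: init() is called once
ensure_initialized(client)  # simulated restart of the caller: init() is skipped
```

With this guard, repeated restarts of the caller no longer trigger the reset, as long as the service itself keeps running.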

PlagueCZ added the bug label Jun 27, 2023
@guvenc
Collaborator

guvenc commented Jul 4, 2023

Correct. The workaround in metalnet is OK. Once vnet-peering is implemented slightly differently, this init behaviour can be removed completely (reverted to the original behaviour). Therefore, a fix for the current code state may not be necessary. Leaving this open for the moment.
