Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor cluster bringup. #180

Merged
merged 8 commits into from
Oct 4, 2023
Merged

Conversation

dceara
Copy link
Collaborator

@dceara dceara commented Sep 28, 2023

Define explicit nodes for all components (central DBs, relays, etc.), avoid IP guesswork and fix a few bugs.

It was confusing to have both and the majority of cases used
'cluster_cfg' instead of 'cluster_config' (100 vs 10).

Signed-off-by: Dumitru Ceara <[email protected]>
Storing a reference will have unexpected results as we might end up
sharing it between different nodes.

Signed-off-by: Dumitru Ceara <[email protected]>
The IPs were skewed by 2 for clustered DB deployments.  Instead of
mangling the connection string generation just allocate the correct IP.

Signed-off-by: Dumitru Ceara <[email protected]>
It removes the need to artificially compute remote values.  Instead just
use the correct RelayNode IPs.

Signed-off-by: Dumitru Ceara <[email protected]>
@dceara
Copy link
Collaborator Author

dceara commented Sep 28, 2023

CC: @LorenzoBianconi - this probably should go in before your PR #169 as it should simplify a bit multi-AZ deployments
CC: @fnordahl @mkalcok - this probably will conflict with your PR #179 but I think it's better to tackle these bringup issues early, before we add a new CMS.

It makes it easier to encapsulate things in the Cluster class.

Signed-off-by: Dumitru Ceara <[email protected]>
All nodes run the same OVSDB protocol.

Signed-off-by: Dumitru Ceara <[email protected]>
A cluster is a collection of central and worker nodes (+ relays).

Signed-off-by: Dumitru Ceara <[email protected]>
@dceara dceara force-pushed the explicit-nodes-refactor branch from 15ddc8b to 1380bba Compare September 28, 2023 23:04
@mkalcok
Copy link
Contributor

mkalcok commented Sep 29, 2023

Thanks for the heads-up @dceara . I'll rebase/fix #179 once this is merged.

@dceara
Copy link
Collaborator Author

dceara commented Sep 29, 2023

Thanks for the heads-up @dceara . I'll rebase/fix #179 once this is merged.

Thanks, if you have some time to review #180 that would also be awesome. :)

Copy link
Contributor

@mkalcok mkalcok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I'm not expert on ovn-heater, I'd say that overall LGTM, nice changes.

@@ -56,7 +56,7 @@ class Node(ovn_sandbox.Sandbox):
def __init__(self, phys_node, container, mgmt_ip):
super().__init__(phys_node, container)
self.container = container
self.mgmt_ip = mgmt_ip
self.mgmt_ip = netaddr.IPAddress(mgmt_ip)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the concern here that other parts of the code might change mgmt_ip value? Because looking at the docs, the IPAddress object looks pretty immutable. (Just curious here)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's exactly it. I was doing something like that while trying out @LorenzoBianconi's ovn-ic changes, using a netaddr.IPAddress externally through iterate through node IPs and ending up changing all the ones that were already assigned.

@@ -147,6 +139,27 @@ def central_containers(self):
return self.db_containers


class RelayNode(Node):
def __init__(self, phys_node, container, mgmt_ip):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly unrelated, but I was thinking whether there's a good way to improve semantics/logic of the container argument in the Node subclasses. All three WorkerNode, CentralNode and RelayNode accept container argument but it doesn't really affect anything. When creating these classes you have to match what's pretty much hardcoded. For the relay and central, prefix for the container names is set in ovn-fake-multinode script and for the worker nodes it can be set via prefix config option.

I think it'd be better if the logic of creating instances of these classes was entirely hidden from user and they'd be created automatically based on the values from test scenario and configs. Right now these classes give illusion of control but in reality user has to match names and counts defined somewhere else.

This is of course topic for entirely separate change, just something I was thinking about.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good point. I'll see if it's easy to handle with a new commit in this PR, otherwise I'll open an issue to follow up later, thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't get a chance to do this but I did open an issue to track it. I'll merge the PR as is to unblock other contributions. Thanks for the review, @mkalcok!

@dceara dceara merged commit eb0ab8e into ovn-org:main Oct 4, 2023
@dceara dceara deleted the explicit-nodes-refactor branch October 4, 2023 15:00
@fnordahl fnordahl mentioned this pull request Oct 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants