Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Snap Strategy #306

Closed
wants to merge 10 commits into from
Closed

Conversation

gabrielcocenza
Copy link
Member

@gabrielcocenza gabrielcocenza commented Sep 6, 2024

DCGM and smartctl- will be installed via snap. This PR creates the SnapStrategy so we can install, remove and check as it happens with other strategies in the code base.

DCGM and smartctl- will be installed via snap. This PR creates the
SnapStrategy so we can install, remove and check as it happens
with other stratigies in the code base.
src/hw_tools.py Outdated Show resolved Hide resolved
Copy link
Contributor

@samuelallan72 samuelallan72 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm not sure how helpful it is to add this before adding any actual snaps, because this code doesn't have any real use yet, can't be functionally tested, and we don't have any client code that uses it (ie. we don't know yet if this api will be what we want).

It's tricky with figuring out when to write things and in what order, and I think I've gone back and forth over it. For this, my opinion is that this is a new feature where we should add support for an actual snap, not just developing an api in a vaccuum. :)

@gabrielcocenza
Copy link
Member Author

@samuelallan72

The DCGM and Smartctl snaps are not ready yet. This PR can be pending until those snaps are available. The idea here is just start having conversations about the approach and not wait until is ready for trying to figure out what to do.

src/hw_tools.py Outdated Show resolved Hide resolved
src/hw_tools.py Outdated
self._name = tool
self.snap_name = tool.value

def install(self, channel: str = "latest/stable") -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that channel should be part of init function, so the install is not needed. Otherwise, I do not see any benefit of even adding this strategy instead of directly using snap lib.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the channel is part of the configuration we get from the charm configuration?
Actually we need to know how to get the channel information when execute the install function. This PR need to include the changing of HWToolHelpe?

@gabrielcocenza Can you add more information about this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've included the channel at the init and had to change the HWToolHelper to be able to instantiate a class passing the channel

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it's good for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something was lost in the splitting of PRs, but this method still takes channel as an argument, and doesn't use self.channel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need def install(self, channel) but not def __init__(self, channel)? if you are following the current design? https://github.com/canonical/hardware-observer-operator/blob/main/src/hw_tools.py#L182

src/hw_tools.py Outdated Show resolved Hide resolved
src/hw_tools.py Outdated
self._name = tool
self.snap_name = tool.value

def install(self, channel: str = "latest/stable") -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the channel is part of the configuration we get from the charm configuration?
Actually we need to know how to get the channel information when execute the install function. This PR need to include the changing of HWToolHelpe?

@gabrielcocenza Can you add more information about this?

@gabrielcocenza gabrielcocenza changed the title Add Snap Strategy Add Snap Strategy and DCGM snap Sep 11, 2024
src/service.py Outdated Show resolved Hide resolved
src/service.py Outdated Show resolved Hide resolved
src/hw_tools.py Outdated
snap.add(self.snap_name, channel=self.channel)
except snap.SnapError as err:
logger.error(
"Failed to install %s on channel: %s: %s", self.snap_name, self.channel, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
"Failed to install %s on channel: %s: %s", self.snap_name, self.channel, err
"Failed to install %s from channel: %s: %s", self.snap_name, self.channel, err

src/hw_tools.py Outdated
"Failed to install %s on channel: %s: %s", self.snap_name, self.channel, err
)
else:
logger.info("Installed %s on channel: %s", self.snap_name, self.channel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logger.info("Installed %s on channel: %s", self.snap_name, self.channel)
logger.info("Installed %s from channel: %s", self.snap_name, self.channel)

src/hw_tools.py Outdated

def enable_services(self) -> None:
"""Enable the snap services."""
# breakpoint()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is a leftover from manual tests

src/hw_tools.py Outdated
@@ -695,14 +738,15 @@ def install(self, resources: Resources, hw_available: Set[HWTool]) -> Tuple[bool
if path:
strategy.install(path)
# APTStrategy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: needs updating

src/service.py Outdated Show resolved Hide resolved
src/service.py Outdated Show resolved Hide resolved
src/service.py Outdated Show resolved Hide resolved
src/service.py Outdated Show resolved Hide resolved
config.yaml Outdated Show resolved Hide resolved
src/config.py Outdated Show resolved Hide resolved
src/service.py Outdated Show resolved Hide resolved
tests/unit/test_charm.py Outdated Show resolved Hide resolved
@Pjack
Copy link

Pjack commented Sep 13, 2024

@gabrielcocenza Would you mind separating this PR into two parts? The first part is common to both the smartctl-exporter-snap and dcgm-exporter-snap, so it would be more convenient to work on them in parallel if we decouple the first and second parts. Thanks!

  1. SnapStrategy
  2. Integrate dcgm exporter snap

I propose to merge SnapStrategy first.

@gabrielcocenza gabrielcocenza changed the title Add Snap Strategy and DCGM snap Add Snap Strategy Sep 16, 2024
@gabrielcocenza
Copy link
Member Author

gabrielcocenza commented Sep 16, 2024

@samuelallan72 @aieri @rgildein Eric asked to split the SnapStrategy from the implementation of the dcgm exporter. I guess now there is a good idea on how it's going to be used and I'll open a PR soon on top of this one to implement the dcgm exporter considering those comments

src/hw_tools.py Outdated
self._name = tool
self.snap_name = tool.value

def install(self, channel: str = "latest/stable") -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something was lost in the splitting of PRs, but this method still takes channel as an argument, and doesn't use self.channel.

src/hw_tools.py Outdated
Comment on lines 208 to 211
def enable_services(self) -> None:
"""Enable the snap services."""
self.snap_client.start(list(self.snap_client.services.keys()), enable=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon this could be a private method (it's not used outside the strategy), or even better inlined because it's a single line method.

src/hw_tools.py Outdated
def install(self, channel: str = "latest/stable") -> None:
"""Install the snap from a channel."""
try:
snap.add(self.snap_name, channel=channel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
snap.add(self.snap_name, channel=channel)
snap.add(self.snap_name, channel=self.channel)

Comment on lines +196 to +198
logger.error(
"Failed to install %s from channel: %s: %s", self.snap_name, self.channel, err
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logs an error to the charm log, but that's silent to the user - can we bubble this error up to the charm status? How do the other exporters handle installation failures?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good found Samuel!

Please follow the current error handling design here.

except (
ResourceFileSizeZeroError,
OSError,
apt.PackageError,
ResourceChecksumError,
) as e:
logger.warning("Strategy %s install fail: %s", strategy, e)
fail_strategies.append(strategy.name)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to re-raise the error. Thanks

src/hw_tools.py Outdated
self.channel = channel
self.snap_client = snap.SnapCache()[tool.value]

def install(self, channel: str = "latest/stable") -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def install(self, channel: str = "latest/stable") -> None:
def install(self) -> None:

Comment on lines +196 to +198
logger.error(
"Failed to install %s from channel: %s: %s", self.snap_name, self.channel, err
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good found Samuel!

Please follow the current error handling design here.

except (
ResourceFileSizeZeroError,
OSError,
apt.PackageError,
ResourceChecksumError,
) as e:
logger.warning("Strategy %s install fail: %s", strategy, e)
fail_strategies.append(strategy.name)

@@ -175,6 +176,44 @@ def remove(self) -> None:
# the remove option.


class SnapStrategy(StrategyABC):
Copy link
Contributor

@jneo8 jneo8 Sep 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion:

I kind of prefer code like this:

  • No init function
  • hard code HWTool
  • Pass channel on install function
class SnapStrategyABC(StrategyABC, metaclass=ABCMeta):
    """Snap strategy class."""

    def install(self, channel: str = "latest/stable") -> None:
        """Install the snap from a channel."""
        # define install details here.
        ...
        
class SmartCtlStrategy(SnapStrategyABC):
    _name = HWTool.SMARTCTL_EXPORTER
                     
class HWToolHelper:
    """Helper to install vendor's or hardware related tools."""

   def strategies(self) -> List[StrategyABC]:
        """Define strategies for every tools."""
        return [
            StorCLIStrategy(),
            RedFishStrategy(),
            ...
            # It's more easy to define strategy here
            SmartCtlStrategy(),
        ]
    # The channel information should only be given here. This make sure the object dependency is one-direction.
    def install(self, resources: Resources, channel: (some-type), hw_available: Set[HWTool]) -> Tuple[bool, str]:
        """Install tools."""
        
        ...
           
        for strategy in self.strategies:
            if strategy.name not in hw_available:
                continue
            try:
                if isinstance(strategy, TPRStrategyABC):
                    path = fetch_tools.get(strategy.name)  # pylint: disable=W0212
                    if path:
                        strategy.install(path)
                if isinstance(strategy, SnapStrategy):
                    # The channel should be an argument for install function
                    strategy.install(channel)
                elif isinstance(strategy, APTStrategyABC):
                    strategy.install()  # pylint: disable=E1120
                logger.info("Strategy %s install success", strategy)
            except (
                ResourceFileSizeZeroError,
                OSError,
                apt.PackageError,
                ResourceChecksumError,
                snap.SnapError,
            ) as e:
                logger.warning("Strategy %s install fail: %s", strategy, e)
                fail_strategies.append(strategy.name)

        if fail_strategies:
            return False, f"Fail strategies: {fail_strategies}"
        return True, ""

So the (dis)advantage are:

  • The strategy can keep in hard code, it's more easy to trace and keep the flexibility. (Imagine if you have different operation for DCGM & Smart).
  • Chance to use snap resource in the future
  • The only breaking change only happen on HWToolHelper.install function. Other design won't be broken.
    • Get the channel information in Charm class object and pass it to install function. This make sure the object dependency is one direaction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different snaps will have different channels to install from. I don't see how passing a single channel on HWToolHelper can solve this. I think it would be very strange to have one charm config for channel to rule on all snaps channels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I agree that we should remove the constructor for consistency with the other strategies, I don't quite get why it would be preferable to move complexity to the install function, as that requires the caller to have deeper knowledge about how each strategy works and adds extra conditionals to HWToolHelper.install

Copy link
Contributor

@jneo8 jneo8 Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a protocol to pass all the required material, resources, configuration, and hardware detect result, for the strategies.
So install function is the protocol right now. If we break this rule then the ISP(interface-segregation principles) will be broken then you have to find a tricky/hacky way to import those material into the strategies.

(I am open to any way to refactor the protocol we have now, but it's a refactor so it shouldn't happen on this PR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different snaps will have different channels to install from. I don't see how passing a single channel on ...

It's a pseudocode code, so pass the configuration you want.

Copy link
Contributor

@aieri aieri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, but the snap strategy should be aligned with the other strategies in terms of configuration being done by the client when calling install and not in the constructor (for consistency)

@@ -175,6 +176,44 @@ def remove(self) -> None:
# the remove option.


class SnapStrategy(StrategyABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I agree that we should remove the constructor for consistency with the other strategies, I don't quite get why it would be preferable to move complexity to the install function, as that requires the caller to have deeper knowledge about how each strategy works and adds extra conditionals to HWToolHelper.install

@gabrielcocenza
Copy link
Member Author

while I agree that we should remove the constructor for consistency with the other strategies, I don't quite get why it would be preferable to move complexity to the install function, as that requires the caller to have deeper knowledge about how each strategy works and adds extra conditionals to HWToolHelper.install

I agree....

@aieri @jneo8 @samuelallan72 After putting more effort on how to integrate the dcgm snap I ended up noticing that we might not need to have a SnapStrategy for exporters that are snaped.

When we have debian packages or other binaries the Strategy is installing and removing the packages while the exporter on the service module is responsible to render the systemd configs and check the health of the daemons.

When we have a snap I'm not seeing the need to have this kind of separation and it's seems to me that it's simpler just handle everything as a service and we'll be able to do pretty much everything in just one module.

Please take a look at the draft PR that I've created at #319

Unfortunately this PR is showing to be hard to have the Generic snap class without a real implementation. In any case, I think this other PR has a cleaner approach

@gabrielcocenza
Copy link
Member Author

Closing this PR. The approach changed to not have a SnapStrategy and work directly with the exporter class itself. The new PR is at #319

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants