Implement a way of working for passing secrets as arguments #139

Open
PhilippeMoussalli opened this issue May 16, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@PhilippeMoussalli
Contributor

We need a way to pass secret arguments in Kubeflow, as this would come in handy for use cases such as a service account key, a Hugging Face Hub key, ...

@RobbeSneyders
Member

This might be runner dependent.

@RobbeSneyders added the enhancement label Aug 29, 2023
@PhilippeMoussalli
Contributor Author

> This might be runner dependent.

Is it needed for the local runner? It seems more relevant for the remote one, since multiple users can have access to the cluster and see the specs.

@RobbeSneyders
Member

Yes, I meant that we might have to use framework-specific features of the underlying execution framework. We'll have multiple remote runners in the future.

@PhilippeMoussalli
Contributor Author

Proposal:

We make the process of retrieving the secrets cloud-dependent (e.g. in GCP we would use Secret Manager). This is similar to how Prefect does it, and it is also the method recommended by Vertex AI:

https://prefecthq.github.io/prefect-gcp/#using-prefect-with-google-secret-manager
https://prefecthq.github.io/prefect-aws/#using-prefect-with-aws-secrets-manager
https://cloud.google.com/vertex-ai/docs/pipelines/secret-manager

We can detect which cloud is used from the base path. On the ComponentOp, we can allow passing secret_kwargs.

Example for GCP:

ComponentOp(secret_kwargs={"project_id": ..., "secret_id": ..., "version_id": ...})
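
To illustrate detecting the cloud from the base path, here is a minimal sketch. The function name `detect_cloud_provider`, the provider labels, and the scheme mapping are assumptions for illustration and are not part of Fondant; the sketch only assumes that remote base paths use URL schemes such as `gs://` or `s3://`.

```python
from urllib.parse import urlparse

# Assumed mapping from URL scheme to cloud provider (hypothetical labels).
_SCHEME_TO_PROVIDER = {
    "gs": "gcp",
    "s3": "aws",
}


def detect_cloud_provider(base_path: str) -> str:
    """Return a provider label for a base path, or "local" if none matches."""
    scheme = urlparse(base_path).scheme.lower()
    # Paths without a known scheme (e.g. "/tmp/artifacts") count as local.
    return _SCHEME_TO_PROVIDER.get(scheme, "local")
```

A local base path falls through to `"local"`, which is the case where no cloud secret manager can be inferred.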

In Fondant, we can provide an abstract class that detects the cloud provider and handles retrieving the secret.

from abc import ABC, abstractmethod


class AbstractSecretManager(ABC):
    @abstractmethod
    def get_secret(self):
        ...


class GCPSecretManager(AbstractSecretManager):
    ...


class AWSSecretManager(AbstractSecretManager):
    ...
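
As a rough illustration of how these classes could be filled in, here is a sketch under a few assumptions: the keyword arguments of `get_secret`, the `resource_name` helper, the `SECRET_MANAGERS` registry, and `get_secret_manager` are hypothetical names. The GCP and AWS calls use the `google-cloud-secret-manager` and `boto3` client libraries, which are imported lazily so the classes can be defined without either SDK installed.

```python
from abc import ABC, abstractmethod


class AbstractSecretManager(ABC):
    @abstractmethod
    def get_secret(self, **secret_kwargs) -> str:
        """Fetch one secret and return it as text."""


class GCPSecretManager(AbstractSecretManager):
    @staticmethod
    def resource_name(project_id: str, secret_id: str, version_id: str = "latest") -> str:
        # Resource path format used by Google Secret Manager.
        return f"projects/{project_id}/secrets/{secret_id}/versions/{version_id}"

    def get_secret(self, *, project_id: str, secret_id: str, version_id: str = "latest") -> str:
        # Lazy import: only needed when a secret is actually fetched.
        from google.cloud import secretmanager

        client = secretmanager.SecretManagerServiceClient()
        name = self.resource_name(project_id, secret_id, version_id)
        response = client.access_secret_version(request={"name": name})
        return response.payload.data.decode("UTF-8")


class AWSSecretManager(AbstractSecretManager):
    def get_secret(self, *, secret_id: str) -> str:
        # Lazy import: only needed when a secret is actually fetched.
        import boto3

        client = boto3.client("secretsmanager")
        return client.get_secret_value(SecretId=secret_id)["SecretString"]


# Hypothetical registry keyed by provider label.
SECRET_MANAGERS = {"gcp": GCPSecretManager, "aws": AWSSecretManager}


def get_secret_manager(provider: str) -> AbstractSecretManager:
    """Instantiate the secret manager registered for a provider label."""
    try:
        return SECRET_MANAGERS[provider]()
    except KeyError:
        raise ValueError(f"No secret manager registered for {provider!r}") from None
```

Fetching still requires that the identity running the component has read access to the secret in question.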

In the executor, we would then need to detect whether secret_kwargs are passed, load the corresponding secret manager handler to fetch the secret(s), and pass them to the component. The secrets should then be accessible at the component level for the user to handle, maybe through a special method similar to transform, but one that handles secrets.

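class Component(PandasTransformComponent):

    def __init__(self, secret) -> None:
        # Assumption: the executor passes the fetched secret in here.
        self.secret = self.get_secret(secret)

    def get_secret(self, secret):
        # Hook where the user can handle the secret, similar to transform.
        return secret

    def transform(self, dataframe):
        ...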

The only downside of this implementation is that the user has to grant the pipeline service account (SA) access to the relevant secret manager. The local runner may be another issue: it can accept a local base path, and the user's SA may not have access to the secret manager. There, we could handle secrets via environment variables in Docker Compose. This may add some complexity to the executor, though, since it would need to check whether the secret comes from an environment variable or should be fetched because secret_kwargs were passed.
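
The executor check described above could look roughly like the following sketch. The function `resolve_secret`, its parameters, and the `HF_TOKEN` variable name in the usage note are hypothetical; `fetch` stands in for whatever secret manager handler the executor loaded.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_secret(
    env_var: str,
    secret_kwargs: Optional[Mapping[str, str]] = None,
    fetch: Optional[Callable[..., str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a secret from a secret manager or from an environment variable."""
    environ = os.environ if environ is None else environ

    # If secret_kwargs are passed, the secret is fetched from the secret manager.
    if secret_kwargs:
        if fetch is None:
            raise ValueError("secret_kwargs were passed but no secret manager is available")
        return fetch(**secret_kwargs)

    # Otherwise fall back to an environment variable (e.g. set via Docker Compose).
    if env_var in environ:
        return environ[env_var]

    raise KeyError(f"Secret not found: no secret_kwargs and {env_var!r} is unset")
```

For example, with the local runner one might call `resolve_secret("HF_TOKEN")` and set `HF_TOKEN` in the Compose file, while a remote runner would pass `secret_kwargs` together with a handler's `get_secret` as `fetch`.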

@RobbeSneyders moved this from Backlog to Breakdown in Fondant development Oct 24, 2023
Projects
Status: Breakdown
Development

No branches or pull requests

2 participants