Description
We're leveraging Argo Workflows to orchestrate our pipeline, which results in each node being executed as an individual `kedro run -n NODE` invocation. With the vanilla setup of `kedro-mlflow`, this results in a new MLflow run id for each node, which is highly undesirable.
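For illustration, here is a minimal sketch of the default behaviour using plain mlflow calls outside kedro (run name chosen arbitrarily): every `start_run` without an explicit run id creates a fresh run, even when the run name is reused, which is exactly what happens when each node is launched as its own process.

```python
import mlflow

# Two separate `start_run` calls stand in for two separate
# `kedro run -n NODE` processes: without a shared run id, MLflow
# creates a brand-new run each time, even if the run name is identical.
with mlflow.start_run(run_name="nightly-pipeline") as first:
    pass

with mlflow.start_run(run_name="nightly-pipeline") as second:
    pass

# Same name, two different run ids -> metrics and artifacts are scattered.
assert first.info.run_id != second.info.run_id
```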
Context
Being able to run large pipelines in a distributed manner
Possible Implementation
To overcome this limitation, we introduced an additional constraint that enforces uniqueness of the run name (code below). We then implemented a hook that works as follows:
If a run name is defined, check whether a run with that name exists
If the run exists, reuse its run id
If it does not exist, create the run and use the new run id
"""Kedro project hooks."""fromkedro.framework.hooksimporthook_implfrompysparkimportSparkConffrompyspark.sqlimportSparkSessionfromkedro.pipeline.nodeimportNodefromdatetimeimportdatetimefromtypingimportAnyimportpandasaspdimporttermplotlibastplfromomegaconfimportOmegaConfimportmlflowclassMLFlowHooks:
"""Kedro MLFlow hook. Mlflow supports the concept of run names, which are mapped to identifiers behind the curtains. However, this name is not required to be unique and hence multiple runs for the same name may exist. This plugin ensures run names are mapped to a single identifier. """@hook_impldefafter_context_created(self, context) ->None:
"""Initialise MLFlow run. Initialises a MLFlow run and passes it on for other hooks to consume. """cfg=OmegaConf.create(context.config_loader["mlflow"])
ifcfg.tracking.run.name:
# Set tracking urimlflow.set_tracking_uri(cfg.server.mlflow_tracking_uri)
experiment_id=self._create_experiment(cfg.tracking.experiment.name)
run_id=self._create_run(cfg.tracking.run.name, experiment_id)
# Update catalogOmegaConf.update(cfg, "tracking.run.id", run_id)
context.config_loader["mlflow"] =cfg@staticmethoddef_create_run(run_name: str, experiment_id: str) ->str:
"""Function to create run for given run_name. Args: run_name: name of the run experiment_id: id of the experiment Returns: Identifier of created run """# Retrieve runruns=mlflow.search_runs(
experiment_ids=[experiment_id],
filter_string=f"run_name='{run_name}'",
order_by=["start_time DESC"],
output_format="list",
)
ifnotruns:
withmlflow.start_run(
run_name=run_name, experiment_id=experiment_id
) asrun:
mlflow.set_tag("created_by", "kedro")
returnrun.info.run_idreturnruns[0].info.run_id@staticmethoddef_create_experiment(experiment_name: str) ->str:
"""Function to create experiment. Args: experiment_name: name of the experiment Returns: Identifier of experiment """experiments=mlflow.search_experiments(
filter_string=f"name = '{experiment_name}'"
)
ifnotexperiments:
returnmlflow.create_experiment(experiment_name)
returnexperiments[0].experiment_id
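For completeness, the hook is registered in the project's settings.py in the usual Kedro way; the module path below is only a placeholder for wherever the class actually lives.

```python
# src/my_project/settings.py  (module name is project-specific)
from my_project.hooks import MLFlowHooks  # placeholder import path

HOOKS = (MLFlowHooks(),)
```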
My suggestion would be to add a flag to the mlflow configuration, e.g.:
```yaml
run:
  id: null
  name: "unique-run"
  stable_run_name: True  # ensures writing to the run with the specified name if it exists
```
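A rough sketch of the logic such a flag could toggle inside the plugin; note that `stable_run_name` and the helper below are hypothetical, not existing kedro-mlflow API:

```python
from typing import Optional

import mlflow


def resolve_run_id(experiment_id: str, run_name: str, stable_run_name: bool) -> Optional[str]:
    """Hypothetical helper: return the id of an existing run with this name
    when `stable_run_name` is enabled, otherwise None so that a fresh run is
    created as today."""
    if not stable_run_name:
        return None
    runs = mlflow.search_runs(
        experiment_ids=[experiment_id],
        filter_string=f"attributes.run_name = '{run_name}'",
        order_by=["start_time DESC"],
        output_format="list",
    )
    return runs[0].info.run_id if runs else None
```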
Possible Alternatives
Supplying a static run-id is not possible, as this results in a ResourceNotFoundError. The API is also limited in the sense that it is not possible to create a run with a specific run-id.
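For reference, a minimal reproduction of that failure with plain mlflow (the tracking URI and run id below are made up; the exact exception class depends on the tracking backend, but with the standard client it surfaces as an MlflowException wrapping the resource-not-found error):

```python
import mlflow
from mlflow.exceptions import MlflowException

mlflow.set_tracking_uri("http://localhost:5000")  # hypothetical tracking server

try:
    # start_run(run_id=...) only resumes an existing run; MLflow issues
    # run ids itself, so an id the backend has never seen is rejected.
    with mlflow.start_run(run_id="0" * 32):
        pass
except MlflowException as exc:
    print(exc)  # e.g. a "Run ... not found" / resource-does-not-exist error
```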
Hi, I understand the need for such a feature, and it would be a great addition.
However, mlflow does not let external orchestrators define the run id on their own, and it seems really wrong to use the run name to do it, because by design it may not be unique. This would require a lot of custom logic on kedro-mlflow's side, and I am not sure this is the correct way to do it.
I think that before rushing into an implementation we should investigate how people handle this for orchestrators like Airflow, and eventually make such a request directly in the mlflow repo. I won't close the issue because it's worth keeping track of this feature request, but I don't see it being implemented as is.
Hi, I absolutely see your point. I think it's an ugly workaround, but I could not think of any other way. We use it on a daily basis now, and the RUN_NAME is injected directly from Argo Workflows, which makes it unique.
I do, however, think that the plugin should be able to support a setup like this; otherwise it is rendered useless for pipelines that run in a distributed fashion.