MLFlow conflict when using Databricks #610

Open
diegoliraQB opened this issue Nov 21, 2024 · 0 comments

Description

When running kedro-mlflow on Databricks, parallelized code can occasionally trigger new runs in the experiment instead of logging to the run started by kedro-mlflow. Databricks enables autologging by default (at least in recent runtimes), and the extra runs might be caused by an mlflow bug.

Proposed solution: Add a new hook to disable autolog, or include it in the current hook.

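import mlflow
from kedro.framework.hooks import hook_impl

class DisableMLFlowAutoLogger:
    @hook_impl(tryfirst=True)
    def after_context_created(self, context) -> None:
        # Disable MLflow autologging (Databricks turns it on by default)
        mlflow.autolog(disable=True)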

Although I encountered this because of Databricks, I can't imagine a context where you'd want autolog enabled together with the plugin. This could be a parameter in mlflow.yml if you want to be flexible.
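To make the "parameter" idea concrete, here is a minimal sketch of a hook that only disables autolog when the configuration says so. It rests on several assumptions: the `tracking.disable_autolog` key is hypothetical and does not exist in kedro-mlflow today, the configuration is passed in as a plain dict rather than read from mlflow.yml, and `ConfigurableAutologHook`, `should_disable_autolog` and `disable_fn` are illustrative names only.

```python
from typing import Any, Callable, Mapping

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch can be run without Kedro installed.
    def hook_impl(**_kwargs):
        return lambda func: func


def should_disable_autolog(mlflow_config: Mapping[str, Any]) -> bool:
    """Disable autolog by default unless the config explicitly opts out."""
    # `tracking.disable_autolog` is a HYPOTHETICAL key, not an existing option.
    tracking = mlflow_config.get("tracking") or {}
    return bool(tracking.get("disable_autolog", True))


def _disable_with_mlflow() -> None:
    import mlflow  # imported lazily so only the real call needs MLflow

    mlflow.autolog(disable=True)


class ConfigurableAutologHook:
    def __init__(
        self,
        mlflow_config: Mapping[str, Any],
        disable_fn: Callable[[], None] = _disable_with_mlflow,
    ) -> None:
        self._config = mlflow_config
        self._disable_fn = disable_fn  # injectable, mainly for testing

    @hook_impl(tryfirst=True)
    def after_context_created(self, context: Any) -> None:
        # Only touch autolog when the (hypothetical) option allows it.
        if should_disable_autolog(self._config):
            self._disable_fn()
```

In a project this could be registered in settings.py, for example `HOOKS = (ConfigurableAutologHook({"tracking": {"disable_autolog": True}}),)`. A real implementation inside the plugin would presumably read the value from the loaded mlflow.yml instead.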

Context

See conversation for context:
https://kedro-org.slack.com/archives/C03RKP2LW64/p1732141412790889

Steps to Reproduce

  1. Start a Kedro pipeline using kedro-mlflow in a Databricks interactive notebook
  2. Use some parallelized code to trigger a new run. Minimal example with Optuna:
    study = optuna.create_study()
    study.optimize(lambda trial: objective(my_data, trial), n_trials=100, n_jobs=-1)

With LightGBM in the objective, this triggers roughly 4-6 new runs.

Expected Result

Results should be in the run started by kedro-mlflow.

Actual Result

New runs are triggered.

Your Environment

Databricks Runtime 15.4 ML
Kedro 0.19.9
kedro-mlflow 0.13.3

Does the bug also happen with the last version on master?

Yes
