[DOC] Update airflow migration #4991

Closed
wants to merge 10 commits into from
1 change: 1 addition & 0 deletions docs/getting_started_with_workflow_development/index.md
@@ -1,4 +1,5 @@
(getting_started_workflow_development)=

# Getting started with workflow development

Machine learning engineers, data engineers, and data analysts often represent the processes that consume, transform, and output data with directed acyclic graphs (DAGs). In this section, you will learn how to create a Flyte project to contain the workflow code that implements your DAG, as well as the configuration files needed to package the code to run on a local or remote Flyte cluster.
1 change: 1 addition & 0 deletions docs/index.md
@@ -139,6 +139,7 @@ Quickstart guide <quickstart_guide>
Getting started with workflow development <getting_started_with_workflow_development/index>
Flyte fundamentals <flyte_fundamentals/index>
Flyte agents <flyte_agents/index>
Migrating to Flyte <migrating_to_flyte/index>
Core use cases <core_use_cases/index>
```

17 changes: 17 additions & 0 deletions docs/migrating_to_flyte/index.md
@@ -0,0 +1,17 @@
(migrating_to_flyte)=
# Migrating to Flyte

```{list-table}
:header-rows: 0
:widths: 20 30

* - {doc}`Migrating from Airflow to Flyte <migrating_from_airflow_to_flyte>`
- Migrate your Airflow DAGs to Flyte with minimal effort.
```

```{toctree}
:maxdepth: -1
:hidden:

migrating_from_airflow_to_flyte
```
81 changes: 81 additions & 0 deletions docs/migrating_to_flyte/migrating_from_airflow_to_flyte.md
@@ -0,0 +1,81 @@
(migrating_from_airflow_to_flyte)=

# Migrating from Airflow to Flyte

Flyte can compile Airflow tasks into Flyte tasks without changing code, which allows you
to migrate your Airflow DAGs to Flyte with minimal effort.

In addition to migration capabilities, Flyte users can seamlessly integrate Airflow tasks into their workflows, leveraging the ecosystem of Airflow operators and sensors.
By combining the robust Airflow ecosystem with Flyte's capabilities such as caching, versioning, and reproducibility, users can run more complex data and machine learning workflows with ease.
Contributor

Suggested change
- By combining the robust Airflow ecosystem with Flyte's capabilities such as caching, versioning, and reproducibility, users can run more complex data and machine learning workflows with ease.
+ By combining the robust Airflow ecosystem with Flyte's capabilities such as caching, versioning, and reproducibility, users can run more complex data and machine learning workflows with ease. For more information, see the [Airflow agent documentation](https://docs.flyte.org/en/latest/flytesnacks/examples/airflow_agent/index.html).

Member Author

updated it


## Prerequisites

- Install `flytekitplugins-airflow` in your Python environment.
- Enable an {ref}`Airflow agent<deployment-agent-setup-airflow>` in your Flyte cluster.
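Before defining any tasks, it can help to confirm that the first prerequisite is met in the interpreter you plan to use. The sketch below is only a convenience check, not part of Flyte or Airflow: the helper `is_importable` is hypothetical, and the import names `airflow` and `flytekitplugins.airflow` are assumptions about how the packages are exposed after installation.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package of a dotted name is missing.
        return False


# Assumed import names; adjust them if your installation differs.
REQUIRED_MODULES = ["airflow", "flytekitplugins.airflow"]

if __name__ == "__main__":
    for name in REQUIRED_MODULES:
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

This only checks the local Python environment. Whether the Airflow agent is enabled in the cluster has to be verified on the cluster itself.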

## Steps

### 1. Define your Airflow tasks in a Flyte workflow

Flytekit compiles Airflow tasks into Flyte tasks, so you can use
any Airflow sensor or operator in a Flyte workflow.


```python
from flytekit import task, workflow
from airflow.operators.bash import BashOperator


@task
def say_hello() -> str:
    return "Hello, World!"


@workflow
def airflow_wf():
    flyte_task = say_hello()
    # Flytekit compiles this Airflow operator into a Flyte task.
    airflow_task = BashOperator(task_id="airflow_bash_operator", bash_command="echo hello")
    # Run the Airflow task before the Flyte task.
    airflow_task >> flyte_task


if __name__ == "__main__":
    print(f"Running airflow_wf() {airflow_wf()}")
```

### 2. Test your workflow locally

:::{note}
Before running your workflow locally, you must configure the [Airflow connection](https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html) by setting the `AIRFLOW_CONN_{CONN_ID}` environment variable.
For example:
```bash
export AIRFLOW_CONN_MY_PROD_DATABASE='my-conn-type://login:password@host:port/schema?param1=val1&param2=val2'
```
:::
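If you launch the workflow from Python (for example through the `if __name__ == "__main__"` block in the file from step 1), the same variable can be set programmatically. The following is a minimal sketch using only the standard library. The helpers `build_conn_uri` and `set_airflow_conn` and the connection values are hypothetical. It assumes the URI layout from the note above and URL-encodes the login and password so that special characters do not break the URI.

```python
import os
from urllib.parse import quote


def build_conn_uri(conn_type, login, password, host, port, schema):
    """Assemble a connection URI in the conn-type://login:password@host:port/schema form."""
    # URL-encode credentials so characters such as '@' or ':' survive URI parsing.
    return (
        f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{schema}"
    )


def set_airflow_conn(conn_id, uri):
    """Export the URI as AIRFLOW_CONN_{CONN_ID} for the current process."""
    env_name = f"AIRFLOW_CONN_{conn_id.upper()}"
    os.environ[env_name] = uri
    return env_name


if __name__ == "__main__":
    # Placeholder values for illustration only.
    uri = build_conn_uri("postgres", "user", "p@ss", "localhost", 5432, "mydb")
    print(set_airflow_conn("my_prod_database", uri))
```

A variable set this way is visible only to the current process and its children, so set it before the workflow runs. For `pyflyte run`, exporting the variable in the shell as shown in the note is the more direct route.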

Although Airflow doesn't support local execution, you can run a Flyte workflow that contains Airflow tasks locally, which is helpful for testing and debugging your tasks before moving them to production.

```bash
pyflyte run workflows.py airflow_wf
```

:::{warning}
Some Airflow operators may require certain permissions to execute. For instance, `DataprocCreateClusterOperator` requires the `dataproc.clusters.create` permission.
When running Airflow tasks locally, you may need to grant the same permissions to your local credentials for the task to execute successfully.
:::

### 3. Move your workflow to production

:::{note}
In production, we recommend storing connections in a [secrets backend](https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html).
Make sure the agent pod has the right permission (IAM role) to access the secret from the external secrets backend.
:::

After you have tested your workflow locally, you can execute it on a Flyte cluster using the `--remote` flag.
In this case, Flyte creates a pod in the Kubernetes cluster to run the `say_hello` task, and then runs
your Airflow `BashOperator` task on the Airflow agent.

```bash
pyflyte run --remote workflows.py airflow_wf
```

:::{note}
Many Airflow operators and sensors have been tested on Flyte, but some may not work as expected.
If you encounter any issues, please file an [issue](https://github.com/flyteorg/flyte/issues) or reach out to the Flyte community on [Slack](https://slack.flyte.org/).
:::