[DOC] Update airflow migration #4991
Closed
Commits (10; changes shown from 5):
ca5256e move @pingsutw airflow migration doc from flytesnacks branch to flyte
ff76a03 copy edits
aacdee6 more copy edits
e4b6c67 merge master and fix conflict
54e4993 update airflow doc (pingsutw)
734d82c move airflow migration guide to development lifecycle section (pingsutw)
ed0f5ed wip (pingsutw)
53516b9 update doc (pingsutw)
fb34760 Merge branch 'master' of github.com:flyteorg/flyte into docs/migratio… (pingsutw)
bfe3458 Merge branch 'docs/migration-guides' of github.com:flyteorg/flyte int… (pingsutw)
@@ -0,0 +1,17 @@
(migrating_to_flyte)=
# Migrating to Flyte

```{list-table}
:header-rows: 0
:widths: 20 30

* - {doc}`Migrating from Airflow to Flyte <migrating_from_airflow_to_flyte>`
  - Migrate your Airflow DAGs to Flyte with minimal effort.
```

```{toctree}
:maxdepth: -1
:hidden:

migrating_from_airflow_to_flyte
```
docs/migrating_to_flyte/migrating_from_airflow_to_flyte.md (81 additions, 0 deletions)
@@ -0,0 +1,81 @@
(migrating_from_airflow_to_flyte)=

# Migrating from Airflow to Flyte

Flyte can compile Airflow tasks into Flyte tasks without changing code, which allows you
to migrate your Airflow DAGs to Flyte with minimal effort.

In addition to migration capabilities, Flyte users can seamlessly integrate Airflow tasks into their workflows, leveraging the ecosystem of Airflow operators and sensors.
By combining the robust Airflow ecosystem with Flyte's capabilities such as caching, versioning, and reproducibility, users can run more complex data and machine learning workflows with ease.

## Prerequisites

- Install `flytekitplugins-airflow` in your Python environment.
- Enable an {ref}`Airflow agent<deployment-agent-setup-airflow>` in your Flyte cluster.

## Steps

### 1. Define your Airflow tasks in a Flyte workflow

Flytekit compiles Airflow tasks into Flyte tasks, so you can use
any Airflow sensor or operator in a Flyte workflow.

```python
from flytekit import task, workflow
from airflow.operators.bash import BashOperator


@task
def say_hello() -> str:
    return "Hello, World!"


@workflow
def airflow_wf():
    flyte_task = say_hello()
    # Flytekit compiles the Airflow operator into a Flyte task.
    airflow_task = BashOperator(task_id="airflow_bash_operator", bash_command="echo hello")
    # Run the Airflow task before the Flyte task.
    airflow_task >> flyte_task


if __name__ == "__main__":
    print(f"Running airflow_wf() {airflow_wf()}")
```

### 2. Test your workflow locally

:::{note}
Before running your workflow locally, you must configure the [Airflow connection](https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html) by setting the `AIRFLOW_CONN_{CONN_ID}` environment variable.
For example,
```bash
export AIRFLOW_CONN_MY_PROD_DATABASE='my-conn-type://login:password@host:port/schema?param1=val1&param2=val2'
```
:::
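
The same connection can be prepared from Python, for example in a local test script. This is a minimal sketch: the helper names `build_conn_uri` and `export_airflow_conn`, the `postgres` connection type, and every credential value are illustrative assumptions, not part of Flytekit or Airflow. Each URI component is percent-encoded so that characters such as `@` or `/` in a password do not break the URI.

```python
import os
from urllib.parse import quote, urlencode


def build_conn_uri(conn_type, login, password, host, port, schema="", params=None):
    """Assemble a connection URI, percent-encoding each component (hypothetical helper)."""
    uri = (
        f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{quote(host, safe='')}:{port}/{quote(schema, safe='')}"
    )
    if params:
        # Extra parameters become the query string.
        uri += "?" + urlencode(params)
    return uri


def export_airflow_conn(conn_id, uri):
    """Set AIRFLOW_CONN_<CONN_ID> for this process and return the variable name."""
    name = f"AIRFLOW_CONN_{conn_id.upper()}"
    os.environ[name] = uri
    return name


# Placeholder values, chosen only for illustration.
uri = build_conn_uri(
    "postgres", "login", "p@ss/word", "localhost", 5432, "mydb",
    params={"param1": "val1", "param2": "val2"},
)
var_name = export_airflow_conn("my_prod_database", uri)
print(var_name)
```

A variable set this way is visible only to the current Python process and its child processes, so it does not replace the `export` above when you launch `pyflyte` from a separate shell.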

Although Airflow doesn't support local execution, you can run a Flyte workflow that contains Airflow tasks locally, which is helpful for testing and debugging your tasks before moving to production.

```bash
pyflyte run workflows.py airflow_wf
```

:::{warning}
Some Airflow operators may require certain permissions to execute. For instance, `DataprocCreateClusterOperator` requires the `dataproc.clusters.create` permission.
When running Airflow tasks locally, you may need to grant those permissions to your local credentials for the task to execute successfully.
:::

### 3. Move your workflow to production

:::{note}
In production, we recommend storing connections in a [secrets backend](https://airflow.apache.org/docs/apache-airflow/stable/security/secrets/secrets-backend/index.html).
Make sure the agent pod has the right permissions (IAM role) to access the secret from the external secrets backend.
:::
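
As a hedged illustration of what such a configuration can look like, Airflow reads its secrets backend from the `AIRFLOW__SECRETS__BACKEND` and `AIRFLOW__SECRETS__BACKEND_KWARGS` environment variables. The sketch below renders those two variables with a hypothetical helper, `secrets_backend_env`. The choice of the AWS Secrets Manager backend and the `airflow/connections` prefix are assumptions made for the example, and how the variables reach the agent pod depends on your deployment.

```python
import json


def secrets_backend_env(backend_class, **backend_kwargs):
    """Return the Airflow environment variables that select a secrets backend.

    Hypothetical helper: it only builds a dict and does not touch the cluster.
    """
    return {
        "AIRFLOW__SECRETS__BACKEND": backend_class,
        # Airflow expects the backend kwargs as a JSON object.
        "AIRFLOW__SECRETS__BACKEND_KWARGS": json.dumps(backend_kwargs, sort_keys=True),
    }


# Assumed example: AWS Secrets Manager with a connections prefix.
env = secrets_backend_env(
    "airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend",
    connections_prefix="airflow/connections",
)
for key, value in env.items():
    print(f"{key}={value}")
```

The resulting key/value pairs would then be set on the agent pod by whatever mechanism your deployment uses for environment variables.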

After you have tested your workflow locally, you can execute it on a Flyte cluster using the `--remote` flag.
In this case, Flyte creates a pod in the Kubernetes cluster to run the `say_hello` task, and then runs
your Airflow `BashOperator` task on the Airflow agent.

```bash
pyflyte run --remote workflows.py airflow_wf
```

:::{note}
Many Airflow operators and sensors have been tested on Flyte, but some may not work as expected.
If you encounter any issues, please file an [issue](https://github.com/flyteorg/flyte/issues) or reach out to the Flyte community on [Slack](https://slack.flyte.org/).
:::
updated it