Refactor Datapipeline service in Google provider #53

Closed
wants to merge 1 commit into from

Conversation

VladaZakharova
Owner

This PR:

  1. Removes the datapipeline module from the Google provider
  2. Moves all related operators into the Dataflow service, based on the API the operators use
  3. Adds the new DataflowCreatePipeline, DataflowRunPipelineOperator and DataflowDeletePipelineOperator operators
  4. Adds a new system test file for the new operators
  5. Fixes the Dataflow documentation to describe the new pipeline management operators


@moiseenkov
Collaborator

I'm afraid the community will consider this PR a breaking change, because operators and hooks are deleted. If customers import them, they will get import errors after upgrading the google provider package.

Instead, I'd suggest:

  • deprecate modules (those that we want to delete in the future)
  • deprecate operators
  • deprecate hooks
  • make deprecated hooks empty and forward existing methods to the new hook. There are two possible ways to achieve that: (1) the deprecated hook's methods redirect calls to the new hook's methods, or, if possible, (2) the deprecated hooks re-inherit from the new one and all their own methods are deleted. If the methods are the same, customer code keeps working with the old hook while the new hook does the work under the hood. I used this approach in "Refactor GKE hooks" (apache/airflow#38404): the deprecated hooks became empty, but they inherit all the needed methods.
  • each deprecation warning states the date when the deprecated item will be removed, according to our deprecation policy (6+ months from the current date, for example 01.12.24)
  • Airflow does not tolerate deprecation warnings: its tests raise exceptions instead of warnings. Whenever these warnings are unavoidable in tests, we should add them to the ignore list: https://github.com/apache/airflow/blob/main/tests/deprecations_ignore.yml
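
The two forwarding approaches from the list above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the actual provider code: `NewPipelineHook`, `OldPipelineHookInherited`, `OldPipelineHookForwarding` and their methods are hypothetical names, and the built-in `DeprecationWarning` stands in for whatever warning class the provider uses.

```python
import warnings


class NewPipelineHook:
    """Hypothetical stand-in for the new hook that holds the real logic."""

    def run_pipeline(self, name: str) -> str:
        return f"running {name}"


class OldPipelineHookInherited(NewPipelineHook):
    """Approach (2): the deprecated hook is empty and inherits every method."""

    def __init__(self, *args, **kwargs):
        # Assumption: the removal date follows the "6+ months" policy example.
        warnings.warn(
            "OldPipelineHookInherited is deprecated and will be removed "
            "after 01.12.24. Use NewPipelineHook instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)


class OldPipelineHookForwarding:
    """Approach (1): the deprecated hook keeps its methods but forwards calls."""

    def __init__(self):
        warnings.warn(
            "OldPipelineHookForwarding is deprecated and will be removed "
            "after 01.12.24. Use NewPipelineHook instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        self._new_hook = NewPipelineHook()

    def run_data_pipeline(self, name: str) -> str:
        # The old method name survives; the new hook does the work.
        return self._new_hook.run_pipeline(name)
```

With approach (2), existing code that instantiates the old class keeps working unchanged as long as the method names match. Approach (1) is the fallback when the old and new method names or signatures differ.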

Please ping me if you have questions, so I can help you.

@VladaZakharova force-pushed the datapipeline-sys-test branch from 30ce297 to b109ac3 on May 20, 2024, 11:14
@VladaZakharova
Owner Author

Thank you for your suggestions!
I have applied all the fixes. Could you please take another look?
