Automl module deprecations #135

Closed


olegkachur-e
Collaborator

Deprecate gcp AutoML module

- Because the AutoML API is being deprecated, deprecate the whole module.

- Suggested removal date: September 30, 2025.
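
A minimal sketch of how a class in a deprecated module can announce its planned removal date, using only the standard library `warnings` module. The decorator name `deprecated_until` and the class `ExampleAutoMLOperator` are hypothetical stand-ins for illustration; they are not the helper or the operators this PR touches.

```python
import functools
import warnings

PLANNED_REMOVAL_DATE = "September 30, 2025"  # the date suggested in this PR


def deprecated_until(removal_date: str):
    """Hypothetical decorator: warn whenever the wrapped class is instantiated."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            warnings.warn(
                f"{cls.__name__} is deprecated and is planned for removal "
                f"after {removal_date}.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not at this wrapper
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
        return cls

    return decorator


@deprecated_until(PLANNED_REMOVAL_DATE)
class ExampleAutoMLOperator:  # illustrative stand-in, not a real operator
    def __init__(self, task_id: str):
        self.task_id = task_id
```

Applying the decorator to every public class in the module keeps the removal date in one constant, so changing the date later is a one-line edit.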


@olegkachur-e olegkachur-e force-pushed the automl_module_deprecations branch from 78552ee to 9497e7c Compare December 3, 2024 00:00
kaxil and others added 7 commits December 3, 2024 15:47
This PR ports the overtime feature on `LocalTaskJob` (added in apache#39890) to the Supervisor.
It allows the Task process to be terminated when it exceeds the configured success overtime threshold, which is useful when we add a Listener to the Task process.

closes apache#44356

Also added `TaskState` to update state and send end_date from task process to the supervisor.
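
A rough sketch of the overtime check described above, assuming the supervisor learns when the task reported success (for example through the `end_date` it receives) and compares the elapsed time against the threshold. The function and parameter names are hypothetical and do not come from the Airflow code base.

```python
from __future__ import annotations


def should_terminate_for_overtime(
    now: float,
    success_reported_at: float | None,
    overtime_threshold: float,
) -> bool:
    """Return True once a task that already reported success has lingered too long.

    All parameter names are hypothetical; times are seconds on a monotonic clock.
    """
    if success_reported_at is None:
        # The task has not reported success yet, so overtime does not apply.
        return False
    return (now - success_reported_at) > overtime_threshold
```

A supervisor loop could call this on each iteration and terminate the task process as soon as it returns `True`.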
* Add FilterParam and type_filter_param_factory

* Refactor Get Event Logs with filter_param_factory

* Refactor add type option for filter_param_factory

* Fix Get Event Logs with latest paginated_select

* Refactor Get Assets Event

* Refactor List Dag Warnings

* Refactor DagRun related

- QueryLastDagRunStateFilter
- dag_ids of get_dag_stats

* Remove unused parameters

* Refactor on Dag parameters

* Add any_equal to FilterParam

* Refactor Task Instance

* Fix Get Event Logs type

* Fix after rebase

* Refactor with search_param_factory

* Refactor QueryLastDagRunStateFilter

* Fix get_list_dag_runs_batch
…#44622)

We already issue a reminder to do it since it'll be faster, so we should
just tell folks to do it from the get go.
The goal here is to ensure behavioral parity w.r.t. sensor timeouts between deferrable and non-deferrable sensor operators.

With non-deferrable sensors, if there's a sensor timeout, the task fails without retry.  But currently, with deferrable sensors, that does not happen.

Since there's already a "timeout" capability on triggers, we can use this for sensor timeout.  Essentially all that was missing was the ability to distinguish between trigger timeouts and other trigger errors.  With this capability, base sensor can distinguish between the two, and reraise deferral timeouts as sensor timeouts.

So, here we add a new exception type, TaskDeferralTimeout, which base sensor reraises as AirflowSensorTimeout. Then, to take advantage of this feature, a sensor need only ensure that its timeout is passed when deferring. For convenience, we update the task deferred exception signature to take int and float in addition to timedelta, since that's how `timeout` attr is defined on base sensor.  But we do not change the exception attribute type.

In order to keep this PR focused, this PR only updates one sensor to use the timeout functionality, namely, time delta sensor.  Other sensors will have to be done as followups.
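
The control flow described above can be sketched with stand-in classes. The exception names mirror the ones mentioned in this commit message, but the classes below are local placeholders rather than the real Airflow ones, and `resume_sensor` and `normalize_timeout` are hypothetical helper names.

```python
from __future__ import annotations

from datetime import timedelta


class TaskDeferralError(Exception):
    """Stand-in for a generic trigger failure."""


class TaskDeferralTimeout(Exception):
    """Stand-in for the new exception raised when a trigger times out."""


class AirflowSensorTimeout(Exception):
    """Stand-in for the sensor timeout that fails the task without retry."""


def normalize_timeout(timeout: int | float | timedelta | None) -> timedelta | None:
    """Accept int/float seconds or a timedelta, but always store a timedelta."""
    if timeout is None or isinstance(timeout, timedelta):
        return timeout
    return timedelta(seconds=timeout)


def resume_sensor(resume_callback):
    """Hypothetical base-sensor hook: translate deferral timeouts only."""
    try:
        return resume_callback()
    except TaskDeferralTimeout as exc:
        # A trigger timeout is re-raised as a sensor timeout; any other
        # trigger error propagates unchanged.
        raise AirflowSensorTimeout(*exc.args) from exc
```

Because only `TaskDeferralTimeout` is caught, other trigger errors keep their existing retry behaviour in this sketch.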
Head is confusing cus in git speak it means "what you have checked out".  That's not what we mean here.  Here we're trying to describe, most commonly, the branch in the repo that the user wants this bundle to track or "follow".  Branch would be a good name, but technically it could also be a tag, or even a commit hash.
@olegkachur-e olegkachur-e force-pushed the gcp_tanslate_models_operators_v branch from 46cbd1c to b15bcc8 Compare December 3, 2024 23:29
@olegkachur-e olegkachur-e force-pushed the gcp_tanslate_models_operators_v branch from b15bcc8 to 43e013c Compare December 4, 2024 11:25
…44642)

90% of these tests created a DagProcessorJobRunner with the Manager inside it,
then did absolutely nothing with the JobRunner object. This makes the tests
more directly use what they are testing.

(DagProcessorJobRunner itself is as simple as can be -- it calls `start()` ->
`terminate()` -> `end()` so we don't lose much of anything by not testing it
explicitly)
Collaborator

@MaksYermak MaksYermak left a comment


LGTM
@moiseenkov what do you think?

* AIP-84 De-nest Tag Tags endpoint

* Fix CI

bbovenzi and others added 2 commits December 4, 2024 09:24
* Create dag graph with nested groups and join_ids

* move opengroup logic to a local context provider

* Add dag runs list, details, and failed runs button

* Refactor tabs to use custom styled NavLinks

* Remove note from table
@moiseenkov
Collaborator

It also looks good to me, but not all deprecations contain replacements. Can we provide them?

If there is no replacement, then we should provide a reason at least.
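
One way to enforce this rule is to make the warning text impossible to build without either a replacement or a reason. The sketch below is only an illustration: `build_deprecation_message` and its parameter names are hypothetical, and `SomeReplacementOperator` in the usage note is a placeholder rather than a real class.

```python
from __future__ import annotations


def build_deprecation_message(
    name: str,
    planned_removal_date: str,
    use_instead: str | None = None,
    reason: str | None = None,
) -> str:
    """Build a deprecation message, refusing to omit both replacement and reason."""
    if not use_instead and not reason:
        raise ValueError(f"{name}: provide either a replacement or a reason")
    message = f"{name} is deprecated and will be removed after {planned_removal_date}."
    if use_instead:
        message += f" Please use {use_instead} instead."
    if reason:
        message += f" Reason: {reason}"
    return message
```

For example, `build_deprecation_message("OldOperator", "September 30, 2025", use_instead="SomeReplacementOperator")` yields a message that names the replacement, while calling it with neither argument raises `ValueError`.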

vatsrahul1001 and others added 18 commits December 7, 2024 10:00
* remove weaviate deprecations

* update provider name in change log

* fix docs
* remove deprecations

* add changelog

* update provider name in changelog
* Random doc typos

* Update contributing-docs/testing/unit_tests.rst

Co-authored-by: Shahar Epstein <[email protected]>

* Update contributing-docs/testing/unit_tests.rst

---------

Co-authored-by: Shahar Epstein <[email protected]>
* Remove Provider Deprecations in Sqlite

* Adjust docs after deprecation
The test sometimes runs for a longer time and generates more
requests - thus producing slightly different output and count
of requests. This PR accepts bigger request count.
…ing it (apache#44766)

We had a bug hidden in our tests by our use of mocks -- if the subprocess
returned any output, then `self.selector.select()` would return straight away,
not waiting for the maximum timeout, which would result in the "escalation"
signal being sent after one output, not after the given interval.
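
A sketch of one way to avoid this class of bug: track a deadline and keep calling `select()` with the remaining time, so early output shortens a single call but not the overall wait. The function name `wait_full_interval` is hypothetical, and this is not the code changed in the commit.

```python
import selectors
import socket
import time


def wait_full_interval(selector: selectors.BaseSelector, interval: float) -> int:
    """Service ready sockets until `interval` seconds have elapsed.

    Returns the number of bytes drained. Output that arrives early ends only
    the current select() call, never the overall wait.
    """
    deadline = time.monotonic() + interval
    drained = 0
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return drained
        for key, _events in selector.select(timeout=remaining):
            drained += len(key.fileobj.recv(4096))
```

Only after this function returns would the caller send the escalation signal. A production version would also unregister sockets that reach end-of-file, which this sketch omits.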
… process (apache#44767)

has exited.

We noticed sometimes in CI that we would get 3 requests made, which
"shouldn't" happen, once it gets the 4xx error to the heartbeat it is meant to
kill the task process.

The cause of this was mostly an artifact of the short heartbeat interval we
used in the tests, and how we poll for the subprocess exit code. I don't think
it could have happened in practice (and it wouldn't affect anything if it did)
but I've made it more robust anyway.
* Remove Provider Deprecations in Apprise

* Typo
* Remove deprecated code from pagerduty provider and update changelog

* modified changelog from pagerduty

* modified test_pagerduty.py

* modified import statement and fix static check

---------

Co-authored-by: pratiksha rajendrabhai badheka <pratiksha@DESKTOP-T5HUA05>
* remove deprecations

* updating change log

* fixing static checks
@olegkachur-e olegkachur-e force-pushed the automl_module_deprecations branch from 9497e7c to e18b9fc Compare December 8, 2024 00:32
potiuk and others added 9 commits December 8, 2024 17:12
…ache#44774)

This is extracted out of apache#44686 - pre-requisite for consistency
check and consistency change to always use version_compat embedded in
providers and avoid mistakes with importing the compat from tests
in the providers code.

This is purely extraction of constants that used to be in compat module
to version_compat - which will make it easy to write the pre-commit
to check if version_compat from tests_modules is used accidentally.
…44772)

When there is a test that does not allow pytest command to quit
cleanly, in case of parallel commands, we have no chance to see
the outputs of test command that failed, because whole CI job is
cancelled and we only upload the logs on failure in the following
step of the job.

Adding timeout for parallel tests that is a little shorter than
the job timeout will give a chance for our tests to get cancelled
before the job timeout occurs, and even if we will not see the logs
in the output of the cancelled `breeze testing` command, the logs
should be uploaded as artifacts in this case.

Also we are serving "cancelled" status of job, because it's likely
that it will also be possible to do "something" in case a test gets
cancelled due to timeout.
…pache#44776)

The test_reading_from_pipes can sometimes return the logs in a
different order, because the order in which messages will be read
from stdout and stderr and values are put in the log are not
deterministic. Unfortunately this is comparing list of dicts and
dicts are not hashable, so we cannot use the usual trick of converting
the list to set. Instead we are using "pytest-unordered" library
that implements `unordered` helper to run such asserts.

The native pytest for unordered collection comparison is highly
requested but apparently stalled by maintainers. See
the pytest-dev/pytest#10032 issue.
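
To illustrate the problem, here is a standard-library stand-in for an order-insensitive comparison of a list of dicts. It is not how pytest-unordered is implemented, and `same_items_any_order` is a hypothetical helper name.

```python
def same_items_any_order(actual: list, expected: list) -> bool:
    """Order-insensitive list comparison that works for unhashable items such as dicts."""
    remaining = list(expected)
    for item in actual:
        try:
            remaining.remove(item)  # list.remove compares with ==, so no hashing is needed
        except ValueError:
            return False
    return not remaining
```

Calling `set()` on a list of dicts raises `TypeError` because dicts are unhashable, which is why the usual set-based trick does not apply here.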
* Removed deprecated code from apache beam provider

* Sphinx doc fix

* Deprecated DrillOperator

* Reverted non-related change
- Because the AutoML API is being deprecated,
 deprecate the whole module content.
- Update documentation, regarding the deprecations.
- Suggested removal date of September 30, 2025.
@olegkachur-e olegkachur-e force-pushed the automl_module_deprecations branch from e18b9fc to 3e72bf1 Compare December 9, 2024 14:19