[Feature] Prevent concurrent execution #267

Open
2 of 13 tasks
art4ul opened this issue Apr 14, 2020 · 12 comments · Fixed by flyteorg/flytepropeller#326
Labels
backlogged (For internal use. Reserved for contributor team workflow.), enhancement (New feature or request), exo

Comments

art4ul commented Apr 14, 2020

Motivation: Why do you think this is important?
In my project I use a scheduled launch plan to process a dataset every hour. Sometimes the workflow takes more than 1 hour to process the data, and in that case the scheduler creates a new, concurrent execution of the launch plan. Concurrent execution of the job is unacceptable for my use case (I get duplicated output).

Goal: What should the final outcome look like, ideally?
I think it would be nice to have a launch plan option that controls whether concurrent executions are allowed.

Describe alternatives you've considered
The only option for me is to implement distributed locks using an external system like Zookeeper.

Flyte component

  • Overall
  • Flyte Setup and Installation scripts
  • Flyte Documentation
  • Flyte communication (slack/email etc)
  • FlytePropeller
  • FlyteIDL (Flyte specification language)
  • Flytekit (Python SDK)
  • FlyteAdmin (Control Plane service)
  • FlytePlugins
  • DataCatalog
  • FlyteStdlib (common libraries)
  • FlyteConsole (UI)
  • Other

Is this a blocker for you to adopt Flyte
Yes

art4ul added the enhancement and untriaged labels Apr 14, 2020
kumare3 (Contributor) commented Apr 15, 2020

Let me preface this by saying that the feature you are asking for does make sense; it's just that we need to think about how long we should keep waiting. Is there a timeout?
@art4ul on the other hand, for your problem, ideally you want to prevent side effects in Flyte workflows. So why not lock the dataset that will be loaded by the execution every hour, so that subsequent workflows always work on independent units? This also gives you complete replayability.

Let's take an example: you have a dataset that materializes at some frequency.
datasets: d1 (time x), d2 (time x + delta), ...
Your workflow runs every hour, at time t.

You could structure your workflow like this

Task1: lock-dataset(kickoff-time, execution_window="1 hour") -> []dataset_references
    collect_datasets for times (t - execution_window)
    return collected_datasets

Task2: process-dataset(refs []dataset_references)
     process-dataset(refs)

outputs...

As you can see, you are always locking the datasets, so no matter when you actually run, as long as you run with the right kickoff-time you will end up with the right datasets to process.
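The two-task structure above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the `dataset/<timestamp>` reference format, and the `delta` parameter are hypothetical, and nothing here uses a real Flyte API. The point it shows is that the references depend only on the kickoff time, so a rerun with the same kickoff time sees the same datasets.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def lock_datasets(kickoff_time: datetime,
                  execution_window: timedelta,
                  delta: timedelta) -> List[str]:
    """Task1 (hypothetical): derive dataset references from the kickoff time only."""
    refs = []
    t = kickoff_time - execution_window  # start of the window being processed
    while t < kickoff_time:
        # Assumed naming scheme: one dataset per `delta`, named by its timestamp.
        refs.append(f"dataset/{t:%Y%m%dT%H%M}")
        t += delta
    return refs


def process_datasets(refs: List[str]) -> List[str]:
    """Task2 (hypothetical): works only on the references it was handed."""
    return [f"processed:{ref}" for ref in refs]


kickoff = datetime(2020, 4, 14, 12, 0, tzinfo=timezone.utc)
refs = lock_datasets(kickoff, timedelta(hours=1), timedelta(minutes=30))
outputs = process_datasets(refs)
```

Because consecutive kickoff times produce disjoint windows, two overlapping executions would still work on independent units.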

art4ul (Author) commented Apr 16, 2020

Thank you @kumare3 for reaching out!

> Let me preface this by saying that the feature you are asking for does make sense; it's just that we need to think about how long we should keep waiting. Is there a timeout?

I think we should not wait at all; we should prevent/skip the new execution if the current one is not complete.

> @art4ul on the other hand, for your problem, ideally you want to prevent side effects in Flyte workflows. So why not lock the dataset that will be loaded by the execution every hour, so that subsequent workflows always work on independent units? This also gives you complete replayability.
>
> Let's take an example: you have a dataset that materializes at some frequency.
> datasets: d1 (time x), d2 (time x + delta), ...
> Your workflow runs every hour, at time t.
>
> You could structure your workflow like this
>
> Task1: lock-dataset(kickoff-time, execution_window="1 hour") -> []dataset_references
>     collect_datasets for times (t - execution_window)
>     return collected_datasets
>
> Task2: process-dataset(refs []dataset_references)
>      process-dataset(refs)
>
> outputs...
>
> As you can see, you are always locking the datasets, so no matter when you actually run, as long as you run with the right kickoff-time you will end up with the right datasets to process.

I could definitely solve my issue by locking the input dataset during processing. But if I had a guarantee that my job is scheduled as a single instance (yes, it's a weak guarantee), I could avoid additional development and additional dependencies (Zookeeper or Consul). I agree that distributed locks provide a stronger guarantee against double processing of the dataset. But in many cases we need a quick and simple solution, and I believe a "skip if running" option would be very helpful in those cases.
Sorry, but maybe I didn't catch your idea.

kumare3 (Contributor) commented Apr 19, 2020

@art4ul I guess I did not explain it well.

> we should prevent/skip the new execution if current is not complete.

That is a good suggestion. What we could start with is this: when you attach a schedule to a launch plan, we could allow a scheduling policy, where one option is to skip new executions if a previous instance of this schedule is still in progress.
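A minimal sketch of what such a "skip if running" policy could look like, in plain Python. Everything here is hypothetical: `Scheduler`, `Phase`, and `skip_if_running` are illustrative names, not part of Flyte, and a real scheduler would check execution state in the control plane rather than in memory.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Phase(Enum):
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


@dataclass
class Scheduler:
    """Toy in-memory scheduler (hypothetical, for illustration only)."""
    skip_if_running: bool = True
    executions: Dict[str, List[Phase]] = field(default_factory=dict)

    def trigger(self, launch_plan: str) -> bool:
        """Called on every schedule tick; returns True if an execution started."""
        runs = self.executions.setdefault(launch_plan, [])
        if self.skip_if_running and Phase.RUNNING in runs:
            return False  # previous instance still in progress: skip this tick
        runs.append(Phase.RUNNING)
        return True

    def finish(self, launch_plan: str, phase: Phase = Phase.SUCCEEDED) -> None:
        """Mark the oldest running execution of this launch plan as finished."""
        runs = self.executions[launch_plan]
        runs[runs.index(Phase.RUNNING)] = phase


scheduler = Scheduler(skip_if_running=True)
first = scheduler.trigger("hourly")    # starts
second = scheduler.trigger("hourly")   # skipped, the first is still running
scheduler.finish("hourly")
third = scheduler.trigger("hourly")    # starts again
```

Note that this skips the tick entirely instead of queueing it, which matches the "do not wait at all" preference stated earlier in the thread.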

> But If I have a guarantee that my job will be scheduled in single instance (yes it's weak guarantee) I could avoid

I think my example made it hard for you to understand. Sorry for the bad example.
By LOCK I mean: capture the dataset paths completely, as a closure.

So let us take a better example.

Assume a directory called dataset.
dataset gets a new file every hour, so the fully qualified paths for these files could look something like
       dataset/t1, dataset/t2 ..... dataset/tn
Now, let us assume we want to run a scheduled job every 2 hours that processes the files from the last 2 hours. An implementation could run at any time and look back for files from the last 2 hours using the file names, timestamps, etc. But that may not work for late-landing datasets, and things like retries will affect the outcome.
So another option could be to split the work into 2 tasks.

Task1:
Looks at the directory and captures, as a list, all the files that Task2 should work on. It could be implemented as simply as listing the directory in reverse order of time and then filtering out all files older than time t = current time - 2 hours. This gives a locked set of files, so Task2 will not use a different set of files on subsequent retries etc.

Task2:
Works on the input set of files.
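The split described above could look roughly like the following sketch. It assumes a local directory and uses file modification times as the timestamps; `capture_files` and `process_files` are hypothetical names, and Task2's "processing" is just a placeholder that reads file sizes.

```python
import os
import time
from typing import List, Optional


def capture_files(directory: str, lookback_seconds: float,
                  now: Optional[float] = None) -> List[str]:
    """Task1 (hypothetical): snapshot the files modified within the lookback window."""
    now = time.time() if now is None else now
    cutoff = now - lookback_seconds
    recent = []
    with os.scandir(directory) as entries:
        for entry in entries:
            mtime = entry.stat().st_mtime
            if entry.is_file() and mtime >= cutoff:
                recent.append((mtime, entry.path))
    # Newest first, mirroring "reverse order of time".
    return [path for _, path in sorted(recent, reverse=True)]


def process_files(paths: List[str]) -> List[int]:
    """Task2 (hypothetical): works only on the captured list, never re-lists the directory."""
    return [os.path.getsize(path) for path in paths]
```

Since Task2 only receives the captured list, a retry of Task2 sees the same files even if new ones have landed in the directory in the meantime.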

kumare3 (Contributor) commented Apr 19, 2020

Again, that being said, we should definitely implement skipping. Can you help us implement this? It should be really simple to implement.
@katrogan could help you design the solution.

kumare3 removed the untriaged label Jun 7, 2021
kumare3 (Contributor) commented Jun 7, 2021

This is somewhat related to the feature covered by #872. @art4ul would this be enough, or do you want a complete serialization guarantee?
Also related to #420.

Let's do a spec first and solve all of them together.

kumare3 (Contributor) commented Dec 2, 2021

This should not be closed

kumare3 reopened this Dec 2, 2021
kumare3 (Contributor) commented Dec 2, 2021

Cc @EngHabu

eapolinario pushed a commit to eapolinario/flyte that referenced this issue Dec 6, 2022
github-actions bot commented

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions bot added the stale label Aug 26, 2023
github-actions bot commented Sep 2, 2023

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions bot closed this as not planned Sep 2, 2023
eapolinario pushed a commit that referenced this issue Sep 8, 2023
* Minor grammar changes

Grammar fix
Changed Flyte IDL to Flyteidl
Updated index.rst files

Signed-off-by: SmritiSatyanV <[email protected]>
eapolinario added a commit that referenced this issue Sep 8, 2023
* Add workflow state enums and update endpoint (#53)

* Add new phases to Task execution (#52)

* Add new phases to Task execution

* Update flyteidl python version (#54)

* Add state to NamedEntity (#55)

* Add filters to named entity list requests (#56)

* Add an ErrorKind field in ContainerError (#60)

* Add an ErrorKind field in ContainerError

* Bump versions

* Arbitrary container support without Flytekit using FlyteCoPilot (#62)

* Releasing v0.17.29 for RawContainerPlugin (#63)

* Add Auth to execution spec (#59)

* Flyte CoPilot now available for all containers (#64)

* Allow different modes of data download and upload in flyte co-pilot (#65)

* pytorch.proto and respective changes (#61)

* Add FailureHandlingStrategy to workflow idl (#67)

* Add FailureHandlingStrategy to workflow idl

* Rename and allow overrides per LaunchPlan and Execution

* Revert override changes

* revert

* Only keep options we will implement at the moment (#69)

* Only keep options we will implement at the moment

* regenerate

* feature; tfoperator for tensorflow distributed training plugin (#71)

* Add quality of service to launch plan/execution spec (#68)

* IDL changes for Node-Node relationship refactor (#72)

* IDL changes for Node-Node relationship refactor
* Ability to provide information of the parent node.
* Introduction of group_id to indicate grouping of nodes within the Parent node - (instead of dummy nodes)
* Support for querying all nodes under node with group.
* Support for node_id to point to original node in the workflow/graph, and node_name

* feature; catalog metadata and caching status published in node events (#70)

* Catalog store handling

* Event proto updated

* generated

* merged

* Dataset Identifier changed

* Better model

* Exposing TaskNodeMetadata with catalog information in the API

* 0.17.38

* Comment

* docs updated

* Adding proto definitions for supporting SageMaker TrainingJob (built-in algorithms) and HyperparameterTuningJob (#66)

* adding sagemaker protos

Co-authored-by: Haytham AbuelFutuh <[email protected]>

* Add output prefix to LaunchPlanSpec (#74)

* [ignore] Tweaking a couple comments (#75)

* Return full execution inputs & outputs (#73)

* [Autogenerated] Add labels to projects in flyteidl (#77)

* first cut

* bump version

* add to project.json

* regen

Co-authored-by: Konstantin Gizdarski <[email protected]>
Co-authored-by: Katrina Rogan <[email protected]>

* Introduce ProjectUpdate endpoint to Flyte Admin API (#78)

* add project update endpoint

* PR comments

* pr comments

* pr comments (last one ;p)

* spelling

* generate

Co-authored-by: Konstantin Gizdarski <[email protected]>
Co-authored-by: Katrina Rogan <[email protected]>

* Fix Stackdriver log link (#79)

This PR fixes Stackdriver log link using `resource.labels.pod_name` as `advancedFilter` instead of `logName`.

* Create PluginOverrides MatchableResource. (#80)

* Make plugin overrides repeated (#81)

* Add option to record whether a task ran on a spot instance. (#82)

* Support Styx style schedule (#83)

* Support cron schedule with offset.

For example: https://github.com/spotify/styx#schedule-string

* Add generated files

* Update protos/flyteidl/admin/schedule.proto

Co-authored-by: Yee Hing Tong <[email protected]>

* Update protos/flyteidl/admin/schedule.proto

Co-authored-by: Yee Hing Tong <[email protected]>

* Update protos/flyteidl/admin/schedule.proto

Co-authored-by: Yee Hing Tong <[email protected]>

* Regenerate files

* Regenerate files

* Change the message name back

* Revert "Change the message name back"

This reverts commit e468ea836dd30fc01699fd417ee64a182603b7f7.

* Add comments for schedule and offset

Co-authored-by: Yee Hing Tong <[email protected]>

* Adding necessary fields in training job message for distributed training  (#84)

* `distributed_protocol` added as a field in training job resource config

* Added state to project (#88)

Add state to a project

* Create conf.py

* Revert conf.py addition (#91)

* Add filter and sort param when listing projects (#90)

* Add filter and sort param when listing projects

* Bump version

* Deprecate Log Plugins (#94)

* Update CI post migration (#95)

* Migrate off of travis

* move proto test to actions

* wip

* wip

* wip

* wip

* install git

* wip

* wip

* update script

* try uploading

* wip

* wip

* add the rest of the steps

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* Update pull_request.yml

* Update pull_request.yml

* Update pull_request.yml

* Ignore generated code

* Update pull_request.yml

* Update pull_request.yml

* wip

* regenerate

* fix workdir

* wip

* wip

* wip

* wip

* wip

* fixing lint

* lint

* Add NPM Publish Step

* cleanup

* Update setup.py

* update maintainer emails

* Publish public npm package

* Publish NPM Package

* Docs for FlyteIDL (#97)

* Change Security/Permission fields with overrides (#98)

* Change Security/Permission fields with overrides

* Added changes from flyteplugins and also added functionality for FetchFromLiteral (#101)

* Added changes from flyteplugins

* Moved more code fromflyteplugins due to test dependencies

* Moved unit test for time primitive

* Added fetchFromliteral functionality

* Fixing lint issues

* Minor lint fixes

* Moved Fetch literal functionality to separate file

* Incorporated the feedback

Co-authored-by: pmahindrakar <[email protected]>

* Add task version to template (#102)

* Update ArrayJob for Map task changes (#103)

* Update Organization. Lyft -> Flyteorg, use k8s libraries (#109)

* Revamp Security context to be more specific and expose it to task templates (#108)

* Revamp Security Context Message

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Security Context in Admin entities

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* fix tabs

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* add deprecated message

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Update protos/flyteidl/core/types.proto

Co-authored-by: Ketan Umare <[email protected]>
Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Update protos/flyteidl/core/types.proto

Co-authored-by: Ketan Umare <[email protected]>
Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Move to security proto

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* comment

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: Ketan Umare <[email protected]>

* Added new rpc service GetVersion (#110)

* Added new rpc service GetVersion

Signed-off-by: Yuvraj <[email protected]>

* More changes

Signed-off-by: Yuvraj <[email protected]>

* Update get version description

Signed-off-by: Yuvraj <[email protected]>

* Version v0.18.16 (#111)

* Version v0.18.16

Signed-off-by: Ketan Umare <[email protected]>

* Update GetVersion API to match the other API style

 - Have an additional envelop structure

Signed-off-by: Ketan Umare <[email protected]>

* Move proto from datacatalog (#113)

* Move proto from datacatalog

Signed-off-by: niant <[email protected]>

* add generated files

Signed-off-by: niant <[email protected]>

Co-authored-by: niant <[email protected]>

* Add free form map config for task custom (#112)

* Add Secret Group (#114)

* Add Secret Group

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Automatically update flyteidl versions when a github release is created (#115)

* wip: Added version automatically in setup.py

Signed-off-by: yuvraj <[email protected]>

* wip: small fix

Signed-off-by: yuvraj <[email protected]>

* wip: small fix

Signed-off-by: yuvraj <[email protected]>

* wip: added npm version update

Signed-off-by: Yuvraj <[email protected]>

* wip: added workflow for master merge

Signed-off-by: Yuvraj <[email protected]>

* wip: added goreleaser for creating release

Signed-off-by: Yuvraj <[email protected]>

* Revert Getversion method from put to get (#116)

* Revert Getversion method from put to get

Signed-off-by: Yuvraj <[email protected]>

* Added version in api url

Signed-off-by: Yuvraj <[email protected]>

* Remove need condition in github action (#117)

Signed-off-by: yuvraj <[email protected]>

* npm auto release fix (#118)

Signed-off-by: Yuvraj <[email protected]>

Thank you!

* Deprecate log plugins (#123)

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Literal Transformations from a string or an interface based on literalTypes (#120)

* wip: more transformations

Signed-off-by: Ketan Umare <[email protected]>

* Changed the String formatter from %s to %v and added more unit tests (#122)

Signed-off-by: pmahindrakar-oss <[email protected]>

Co-authored-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Ketan Umare <[email protected]>

* Added default condition for unsupported type while creating literal (#124)

* Changed the String formatter from %s to %v and added more unit tests

Signed-off-by: pmahindrakar-oss <[email protected]>

* Added unsupported type default condition

Signed-off-by: pmahindrakar-oss <[email protected]>

Co-authored-by: pmahindrakar-oss <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

Co-authored-by: pmahindrakar-oss <[email protected]>
Co-authored-by: pmahindrakar-oss <[email protected]>

* #patch: Add in task execution event fields (#119)

* Add group_version to secrets (#125)

* Add group_version to secrets

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Fix typo in external resource message (#126)

* Fix field id for ExecutionSpec#auth_role (#127)

Changing field id wasn't intentional

Co-Authored-By: Fernando Diaz <[email protected]>

Signed-off-by: Gleb Kanterov <[email protected]>

* update docs theme (#133)

* update docs theme

Signed-off-by: cosmicBboy <[email protected]>

* add community link

Signed-off-by: cosmicBboy <[email protected]>

* [go] Add blob support for MakeLiteralForType (#131)

The MakeLiteralForType client method currently doesn't support the blob type. This adds support for that.

Signed-off-by: Sam Lai <[email protected]>

* dark theme updates (#134)

Signed-off-by: cosmicBboy <[email protected]>

* Added workflow for updating flyteidl version in all component (#132)

* Added workflow for updating flyte comopnent version

Signed-off-by: yuvraj <[email protected]>

* Added a single workflow for release flyteidl

Signed-off-by: yuvraj <[email protected]>

* revert workflow changes

Signed-off-by: yuvraj <[email protected]>

* bug fix in ci (#135)

Signed-off-by: yuvraj <[email protected]>

* Enable flyteidl update version for all component (#136)

* enable flyteidl update version for all component

Signed-off-by: yuvraj <[email protected]>

* Added more component

Signed-off-by: yuvraj <[email protected]>

* remove flytekit from component

Signed-off-by: yuvraj <[email protected]>

* Pass dynamic workflow closure in node execution events (#141)

* Documentation for Datacatalog (#143)

Datacatalog service documentation

Signed-off-by: Ketan Umare <[email protected]>

* Fix Datacatalog title.rst in docs (#145)

* missing blank line in title.rst

Signed-off-by: Ketan Umare <[email protected]>

* generated docs

Signed-off-by: Ketan Umare <[email protected]>

* Enable release automation for flytectl (#144)

Signed-off-by: yuvraj <[email protected]>

* Add GetOrReserveArtifact and ExtendReservation API (#142)

* Move dynamic workflow metadata to GetNodeExecutionData response (#151)

* Add DYNAMIC_RUNNING node execution phase (#152)

Signed-off-by: Katrina Rogan <[email protected]>

* Introduce Auth Metadata and Identity Grpc Service and support Pkce auth in admin client (#155)

* Introduce Auth Metadata and Identity Grpc Service

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* merge conflicts

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Update deps

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* cleanup deps

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* update deps

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Delete unused catalog client

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Remove the need for UseAuth config to simplify setup further

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Add deprecated comment

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Allow config of insecure creds transmission

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Moved the 3legged auth files from flytectl and also added config option for it (#156)

* Moved the 3legged auth files from flytectl and also added config option for it

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Fixed the expiry bug

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Changed logic to refresh the token

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Refactored the getAuthenticationDialOption

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added more unit tests

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Fixed unit tests

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added more unit tests

Signed-off-by: Prafulla Mahindrakar <[email protected]>
Signed-off-by: Haytham Abuelfutuh <[email protected]>

* refactor pkce package

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* rename

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* bump for dco

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Added ClientSetBuilder inorder to remove keyring dependency (#158)

* Added ClientSetBuilder inorder to remove keyring dependency

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added token cache to deprecated method too

Signed-off-by: Prafulla Mahindrakar <[email protected]>
Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Refactoring latest changes

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* rename

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* do not close server right away

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* don't close http server too soon

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* event init error

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Cleanup

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: pmahindrakar-oss <[email protected]>

* documentation revamp according to RFC (#160)

Signed-off-by: cosmicBboy <[email protected]>

* Update README.rst

* Update validate module to protoc-gen-validate 0.6.1 (#165)

* Update validate module to protoc-gen-validate 0.6.1

Signed-off-by: Hans Werner <[email protected]>

* include generated output

Signed-off-by: Hans Werner <[email protected]>

* update code

Signed-off-by: Hans Werner <[email protected]>

Co-authored-by: Julio Capote <[email protected]>

* Added flyteidl documentation (#167)

* Added flyteidl documentation

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Generated doc requirements and updated CI to not generate docs for now

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Moved to rst file for events

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Moved to rst file for all index files

Signed-off-by: Prafulla Mahindrakar <[email protected]>
(cherry picked from commit 6dc3c4785dc95b05412a9bec2377016cf16b5ea8)

* Added tmp doc generation step for getting doc dependencies

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using RST for documentation generation (#169)

* Fix the links in generated html files

* Added handler for build finished event to fix doc links

* Added contributing guide

* Adding wait for the subprocess

* Fixed the path issue with finding the script for fixing the links

* Added darwin switch for sed -i flag

* Fixed non darwin with -i flag for sed

* Using find instead of xargs as it fails on readthedocs

* Moving the substitution errors to /dev/null

* Escaped the Link values

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using protoc-gen-doc plugin directly

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Removed the example message

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Fixed datacatalog file generation

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using template to generate docs and also removed unused imports in proto files and fixed doc issues

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Minor doc fix

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added dependency to protoc-gen in biolerplate module

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* add main nav font awesome icons (#170)

* add main nav font awesome icons

Signed-off-by: cosmicBboy <[email protected]>

* add "edit on github button" config

Signed-off-by: cosmicBboy <[email protected]>

* fix html theme config

Signed-off-by: cosmicBboy <[email protected]>

* Support annotations/labels in sidecar (#171)

* Add K8sPod as a task target option (#172)

* Support passing an iam role AND a k8s service account (#173)

* Support passing an iam role AND a k8s service account (again) (#175)

Signed-off-by: Katrina Rogan <[email protected]>

* Added boilerplate automation (#162)

* Added boilerplate automation

Signed-off-by: Yuvraj <[email protected]>

* More changes

Signed-off-by: Yuvraj <[email protected]>

* Using protoc-gen-doc from the flyteorg and regenerated docs for new modules (#174)

* Fix the links in generated html files

* Added handler for build finished event to fix doc links

* Added contributing guide

* Adding wait for the subprocess

* Fixed the path issue with finding the script for fixing the links

* Added darwin switch for sed -i flag

* Fixed non darwin with -i flag for sed

* Using find instead of xargs as it fails on readthedocs

* Moving the substitution errors to /dev/null

* Escaped the Link values

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using protoc-gen-doc plugin directly

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Removed the example message

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Fixed datacatalog file generation

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using template to generate docs and also removed unused imports in proto files and fixed doc issues

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Minor doc fix

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added dependency to protoc-gen in boilerplate module

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Pulled the latest boilerplate changes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added support for Generic type for extracting a literal value (#176)

* Added support for Generic type for extracting a literal value

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added grpc gateway jsonpb marshaller and unmarshaller for struct type

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Using json.Marshal

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Incorporated feedback and added another test for string generic

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add workflow execution config matchable attribute (#182)

* #minor Enum type (#183)

* Enum type definition

Signed-off-by: Ketan Umare <[email protected]>

* Enum Type support in flyteidl

Signed-off-by: Ketan Umare <[email protected]>

* lint fixes

Signed-off-by: Ketan Umare <[email protected]>

* re-generated

Signed-off-by: Ketan Umare <[email protected]>

* Enum default values are the first value in the list

Signed-off-by: Ketan Umare <[email protected]>

* Added missing docs and included boiler plate update as prereq for generate (#185)

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add max concurrency to launch plan & execution spec (#186)

* MakeLiteralForType shouldn't print <nil> for nil values (#188)

* Add max concurrency to launch plan & execution spec (#186)

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* MakeLiteralForType shouldn't print <nil> for nil values

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* bump

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Co-authored-by: Katrina Rogan <[email protected]>

* Clean up flyteadmin proto definitions and comments. (#189)

* Add Task Node overrides (#190)

* Update Boilerplate (#181)

Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>

* Bringing back README.md #patch (#192)

* Bringing back README.md

Signed-off-by: Ketan Umare <[email protected]>

* Moved contents of developing.rst to README (#193)

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Replaced doc_requirements.in with doc-requirements.in

Signed-off-by: Prafulla Mahindrakar <[email protected]>

Co-authored-by: pmahindrakar-oss <[email protected]>

* Added InsecureSkipVerify flag (#191)

* add deployment to the top TOC (#194)

Signed-off-by: Ketan Umare <[email protected]>

* algolia search (#195)

Signed-off-by: Samhita Alla <[email protected]>

* fix TOC (#196)

Signed-off-by: Samhita Alla <[email protected]>

* Add core entity links to admin request structures (#197)

* Fix typos in message comments (#199)


Signed-off-by: Sean Lin <[email protected]>

* Add endpoint to recover workflow execution (#187)

* update doc requirements with sphinx v4 (#201)

Signed-off-by: cosmicBboy <[email protected]>

* Update code of conduct (#200)

* update code of conduct

Signed-off-by: Samhita Alla <[email protected]>

* boilerplate

Signed-off-by: Samhita Alla <[email protected]>

* Handling large integers (#202)

* Handling large integers

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add ephemeral storage as a container resource (#203)

Signed-off-by: Katrina Rogan <[email protected]>

* Add ephemeral storage as a matchable resource too (#204)

Signed-off-by: Katrina Rogan <[email protected]>

* Add Raw Output Metadata to events (#205)

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Disable bump-version and goreleaser workflow in the forked repository (#209)

* Update master.yml

Signed-off-by: Kevin Su <[email protected]>

* Update master.yml

Signed-off-by: Kevin Su <[email protected]>

* Update master.yml

Signed-off-by: Kevin Su <[email protected]>

* Add raw output data to admin execution models (#210)

Signed-off-by: Katrina Rogan <[email protected]>

* Update Boilerplate (#208)

Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>

* Fix up comment for raw output data (#211)

Signed-off-by: Katrina Rogan <[email protected]>

* readd docs (#212)

Signed-off-by: Katrina Rogan <[email protected]>

* Change variable map to repeated map entries for ordering (#206)

Signed-off-by: Sean Lin <[email protected]>

* Add Sql in target proto (#214)


Signed-off-by: Kevin Su <[email protected]>

* fix timestamp and duration literal extraction (#218)

* fixed literal extraction for timestamp and duration types - was previously returning protobufs instead of native golang types

Signed-off-by: Daniel Rammer <[email protected]>

* fixed lint error with imports being non-alphabetical

Signed-off-by: Daniel Rammer <[email protected]>

* Revert variable ordering change PR (#220)

* Revert "Change variable map to repeated map entries for ordering (#206)"

Signed-off-by: Sean Lin <[email protected]>

* Update EventError with Is and Unwrap functions (#221)

* update EventError with Is and Unwrap functions for use in errors API

Signed-off-by: Daniel Rammer <[email protected]>

* changed EventError Is* functionality to check entire error stack

Signed-off-by: Daniel Rammer <[email protected]>

* added nil tests to EventError.Is function

Signed-off-by: Daniel Rammer <[email protected]>

* Schema type support for flytectl (#219)

* MPI Operator plugin interface (#217)

* Added mpi plugin

Signed-off-by: Yuvraj <[email protected]>

* Rename variable name

Signed-off-by: Yuvraj <[email protected]>

* Added docs for mpi

Signed-off-by: Yuvraj <[email protected]>

* Support external token source for flyteadmin clients (#222)

* refactor: add token source provider interface
and implementation for clientcredentials and pkce

Signed-off-by: Babis Kiosidis <[email protected]>

* remove unused getAuthType method

Signed-off-by: Babis Kiosidis <[email protected]>

* refactorings

Signed-off-by: Babis Kiosidis <[email protected]>

* add TokenSourceProvider interface documentation

Signed-off-by: Babis Kiosidis <[email protected]>

* move token source provider logic to admin to avoid circular deps

Signed-off-by: Babis Kiosidis <[email protected]>

* fix authentication dial option tests

Signed-off-by: Babis Kiosidis <[email protected]>

* lint files

Signed-off-by: Babis Kiosidis <[email protected]>

* allow external auth process

Signed-off-by: Babis Kiosidis <[email protected]>

* don't forget to trim

Signed-off-by: Babis Kiosidis <[email protected]>

* refactor: change if to switch

Signed-off-by: Babis Kiosidis <[email protected]>

* refactor: rename AuthType value and config

Signed-off-by: Babis Kiosidis <[email protected]>

* refactor: move credentials TS Provider with other types

Signed-off-by: Babis Kiosidis <[email protected]>

Co-authored-by: Babis Kiosidis <[email protected]>

* Add Slack button to README (#224)

* Add Slack button to README

Signed-off-by: Samhita Alla <[email protected]>

* update slack link

Signed-off-by: Samhita Alla <[email protected]>

* Add Architecture field to Container (#226)

* Add Architecture field to Container

Signed-off-by: Anmol Khurana <[email protected]>

* Update enum

Signed-off-by: Anmol Khurana <[email protected]>

* Update enum

Signed-off-by: Anmol Khurana <[email protected]>

* remove events package (#225)

* moved 'go generate' snippet so that when events is removed admin service mocks are still generated

Signed-off-by: Daniel Rammer <[email protected]>

* removed events package

Signed-off-by: Daniel Rammer <[email protected]>

* Adding grpc health service in the clientset (#228)

* Adding grpc health service in the clientset

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Added unit tests

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Update Boilerplate (#229)

Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>

* dummy edit -- updated requirements.txt (#223)

Signed-off-by: Samhita Alla <[email protected]>

* Cache Serialize API (#215)

* removed datacatalog ExtendReservation API

Signed-off-by: Daniel Rammer <[email protected]>

* filled out generated content with 'make generate'

Signed-off-by: Daniel Rammer <[email protected]>

* updated protobuf docs for admin and core definitions

Signed-off-by: Daniel Rammer <[email protected]>

* created ReservationID message to reuse

Signed-off-by: Daniel Rammer <[email protected]>

* changed artifact reservation API to not include handling of artifacts - only reservations.

Signed-off-by: Daniel Rammer <[email protected]>

* added cache reservation comments

Signed-off-by: Daniel Rammer <[email protected]>

* fixed indenting issues that always drive me crazy

Signed-off-by: Daniel Rammer <[email protected]>

* updated discovery_reservable to discovery_serializable to adhere to name change

Signed-off-by: Daniel Rammer <[email protected]>

* moved catalog reservation status enum into a message

Signed-off-by: Daniel Rammer <[email protected]>

* changed discovery_serializable TaskMetadata field to cache_serializable

Signed-off-by: Daniel Rammer <[email protected]>

* Execution closures in the API should defer to GetData calls for passing output data (#234)

Signed-off-by: Katrina Rogan <[email protected]>

* Update Boilerplate (#236)

Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>

* Added flag to pass in CAcerts (#242)

* Added flag to pass in CAcerts

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Incorporated feedback

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* caCert instead of caCerts

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* caCert to caCertFilePath

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add incompatible cluster event error (#240)

* Add aborting workflow execution phase (#245)

* New StructuredDataset type/literal #patch (#227)

Signed-off-by: Yee Hing Tong <[email protected]>

* bump docsearch version (#246)

Signed-off-by: Samhita Alla <[email protected]>

* update flyteidl for new navbar (#247)

* update furo dep git > https (#249)

Signed-off-by: Niels Bantilan <[email protected]>

* Update Boilerplate (#243)

Signed-off-by: Flyte-Bot <[email protected]>

Co-authored-by: flyte-bot <[email protected]>

* Add RetryAttempt and Phase to ExternalResourceInfo proto (#244)

* added retry attempt and phase to ExternalResourceInfo proto

Signed-off-by: Daniel Rammer <[email protected]>

* capitalized ExternalResourceInfo retry_attempt comment

Signed-off-by: Daniel Rammer <[email protected]>

* added index to ExternalResourceInfo proto

Signed-off-by: Daniel Rammer <[email protected]>

* regenerated protos to fix rebase issues

Signed-off-by: Daniel Rammer <[email protected]>

* add sphinx panels to docs (#250)

Signed-off-by: Niels Bantilan <[email protected]>

* Added UpdateExecution API and api objects (#248)

* Expose flyteadmin client's auth opt (#252)

* Expose flyteadmin client's auth opt
Signed-off-by: Sean Lin <[email protected]>

* Move authOpt to global
Signed-off-by: Sean Lin <[email protected]>

* minor fix
Signed-off-by: Sean Lin <[email protected]>

* Introduce TypeAnnotation#minor (#232)

* feat: introduce TypeAnnotation

Signed-off-by: Kenny Workman <[email protected]>

* fix: message description

Co-authored-by: Haytham Abuelfutuh <[email protected]>
Signed-off-by: Kenny Workman <[email protected]>

* fix: regen proto after merge

Signed-off-by: Kenny Workman <[email protected]>

Co-authored-by: Haytham Abuelfutuh <[email protected]>

* Remove AdminAuth Client global state (#254)

* Remove AdminAuth Client global state

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Fix unit test

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Added isDynamic flag for distinguishing subworkflows and dynamic workflows (#256)

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add missing Python libs (#258)

Signed-off-by: Yee Hing Tong <[email protected]>

* Add hash to literal #minor (#237)

* Add hash to Literal

Signed-off-by: Eduardo Apolinario <[email protected]>

* make generate

Signed-off-by: Eduardo Apolinario <[email protected]>

* Comment hash

Signed-off-by: Eduardo Apolinario <[email protected]>

Co-authored-by: Eduardo Apolinario <[email protected]>

* Added RawOutputDataConfig in ExecutionSpec (#260)

* Added RawOutputDataConfig in ExecutionSpec

Signed-off-by: Kevin Su <[email protected]>

* bring back missing rsts

Signed-off-by: Yee Hing Tong <[email protected]>

* update comment, remove @latest, regenerate

Signed-off-by: Yee Hing Tong <[email protected]>

* die @latest

Signed-off-by: Yee Hing Tong <[email protected]>

Co-authored-by: Yee Hing Tong <[email protected]>

* Union Types #minor (#235)

* Add support union type

Signed-off-by: Kevin Su <[email protected]>

* Update union type + add union literal repr

* Update union types to use string tags

* Fix typo + generate protos

* Implement changed design

* generate

* Remove changes to download_tooling.sh

Signed-off-by: Eduardo Apolinario <[email protected]>

Co-authored-by: Kevin Su <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Add custom token source that allows preemptive token refresh (#262)

* Add custom token source that allows preemptive token refresh
Signed-off-by: Sean Lin <[email protected]>

* Switch to apimachinery jitter
Signed-off-by: Sean Lin <[email protected]>

* Switch back to max because min doesn't make sense
Signed-off-by: Sean Lin <[email protected]>

* fix lint
Signed-off-by: Sean Lin <[email protected]>

* goimport
Signed-off-by: Sean Lin <[email protected]>

* minor fix
Signed-off-by: Sean Lin <[email protected]>

* Rename and trim config
Signed-off-by: Sean Lin <[email protected]>

* Introduce cluster assignment attribute to execution spec (#255)

* checkpoint

Signed-off-by: Katrina Rogan <[email protected]>

* one more

Signed-off-by: Katrina Rogan <[email protected]>

* good riddance

Signed-off-by: Katrina Rogan <[email protected]>

* one more

Signed-off-by: Katrina Rogan <[email protected]>

* revert

Signed-off-by: Katrina Rogan <[email protected]>

* DataProxy Service Definition (#261)

* Added missing docs and mocks for DataProxy (#269)

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Additional execution properties as matchable entities. (#253)

* proposal

Signed-off-by: Katrina Rogan <[email protected]>

* comments

Signed-off-by: Katrina Rogan <[email protected]>

* Fixed the import and updated tooling

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Updating the golang version to 1.7

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* feedback

Signed-off-by: Prafulla Mahindrakar <[email protected]>

Co-authored-by: Prafulla Mahindrakar <[email protected]>

* Adding authType in pflags and also updated docs with valid values (#268)

* Adding authType in pflags and also updated docs with valid values

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* removed the default message

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* fixed spaces in messages

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Add cache_status and logs to ExternalResource proto (#271)

* added protobuf dependencies to doc generation

Signed-off-by: Daniel Rammer <[email protected]>

* added cache_status and logs to ExternalResourceInfo proto

Signed-off-by: Daniel Rammer <[email protected]>

* Generate mocks

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Add is_parent & is_dynamic to node execution events (#272)

* Rename DataProxy to DataProxyService for consistency (#273)

* Rename DataProxy to DataProxyService for consistency

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Update client go

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* bump

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Grammar edit  (#267)

* Minor grammar changes

Grammar fix
Changed Flyte IDL to Flyteidl
Updated index.rst files

Signed-off-by: SmritiSatyanV <[email protected]>

* Fix docs (#275)

Signed-off-by: Samhita Alla <[email protected]>

* Introduce taints and tolerations to cluster assignment  (#276)

* Add workflow execution JSON schema (#270)

* Add workflow execution jsonschema

Signed-off-by: Kevin Su <[email protected]>

* Updated schema

Signed-off-by: Kevin Su <[email protected]>

* Added Readme

Signed-off-by: Kevin Su <[email protected]>

* updated

Signed-off-by: Kevin Su <[email protected]>

* Add ContentMD5 parameter for CreateUploadLocationRequest (#278)

* Add ContentMD5 parameter for CreateUploadLocationRequest

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Change contentMD5 type to bytes

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Rename suffix to filename

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* fix docs (#282)

Signed-off-by: Samhita Alla <[email protected]>

* Adding event_version to TaskExecutionEvent and TaskExecutionClosure (#279)

* added event_version to TaskExecutionEvent

Signed-off-by: Daniel Rammer <[email protected]>

* added event_version to TaskExecutionClosure

Signed-off-by: Daniel Rammer <[email protected]>

* generated protos

Signed-off-by: Daniel Rammer <[email protected]>

* Add ServiceHttpEndpoint to be returned to clients (#277)

* Add ServiceHttpEndpoint to be returned to clients

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* add make generate check

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Add missing docs (#284)

Signed-off-by: Kevin Su <[email protected]>

* update docs (#285)

Signed-off-by: Kevin Su <[email protected]>

* Go Linter Github  action (#283)

* Uses re-usable linter

Signed-off-by: Yukesh Kumar <[email protected]>

Co-authored-by: Yuvraj <[email protected]>

* Fix pattern to hide generated code in github (#288)

* Fix pattern to hide generated code in github

Signed-off-by: Eduardo Apolinario <[email protected]>

* Hide changes to rst files

Signed-off-by: Eduardo Apolinario <[email protected]>

Co-authored-by: Eduardo Apolinario <[email protected]>

* #major Release 1.0.0 (#290)

* #major Release 1.0.0

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Update flytestdlib to 1.0.0

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Bump github tag action version (#292)

Signed-off-by: Katrina Rogan <[email protected]>

* Changed default npm/python versions to valid semver (#293)

Signed-off-by: Nick Müller <[email protected]>

* Add interruptible override to execution #minor (#287)

* Added interruptible flag for execution to protos

Signed-off-by: Nick Müller <[email protected]>

* Changed execution interruptible flag to use regular bool

Signed-off-by: Nick Müller <[email protected]>

* Changed interruptible overrides to use BoolValue
Allows distinguishing between a value not being provided and the Go zero value false

Signed-off-by: Nick Müller <[email protected]>

* Interruptible flag comment/documentation

Signed-off-by: Nick Müller <[email protected]>

* Interruptible flag comment/documentation

Signed-off-by: Nick Müller <[email protected]>

* Removed unescaped quotes from proto comments
Included documentation for  in proto generation

Signed-off-by: Nick Müller <[email protected]>

* Re-generated documentation

Signed-off-by: Nick Müller <[email protected]>

* Updates go version used in the pipeline (#294)

* Adds linter for go code

Signed-off-by: Yukesh Kumar <[email protected]>

* Uses re-usable linter

Signed-off-by: Yukesh Kumar <[email protected]>

* changes wrt comment

Signed-off-by: Yukesh Kumar <[email protected]>

* changes wrt comments

Signed-off-by: Yukesh Kumar <[email protected]>

* updates go version in the yaml file

Signed-off-by: Yukesh Kumar <[email protected]>

Co-authored-by: Yuvraj <[email protected]>

* Dockerized docs gen (#295)

* Dockerized docs generation
Now uses protoc-gen-doc Docker image instead of running protoc using the protoc-gen-doc plugin locally

Signed-off-by: Nick Müller <[email protected]>

* Minor cleanup of doc templates
Replaced double with single quotes in proto comments (double quotes cause escaping issues with protoc-gen-doc text renderer)

Signed-off-by: Nick Müller <[email protected]>

* Set locale override during protos/docs generation to ensure consistent sorting

Signed-off-by: Nick Müller <[email protected]>

* Added deck_uri to NodeExecutionEvent (#286)

* Added deck metadata in TaskMetadata

Signed-off-by: Kevin Su <[email protected]>

* Updated IDL

Signed-off-by: Kevin Su <[email protected]>

* Updated IDL

Signed-off-by: Kevin Su <[email protected]>

* wip

Signed-off-by: Kevin Su <[email protected]>

* Updated IDL

Signed-off-by: Kevin Su <[email protected]>

* Updated gitignore

Signed-off-by: Kevin Su <[email protected]>

* Updated idl

Signed-off-by: Kevin Su <[email protected]>

* Updated comment

Signed-off-by: Kevin Su <[email protected]>

* Add CreateDownloadLocation service

Signed-off-by: Kevin Su <[email protected]>

* Add CreateDownloadLocation service

Signed-off-by: Kevin Su <[email protected]>

* Add CreateDownloadLocation service

Signed-off-by: Kevin Su <[email protected]>

* Add deck_uri in NodeExecutionClosure

Signed-off-by: Kevin Su <[email protected]>

* update

Signed-off-by: Kevin Su <[email protected]>

* update

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* updated

Signed-off-by: Kevin Su <[email protected]>

* updated

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* add deck_uri to task execution event

Signed-off-by: Kevin Su <[email protected]>

* Remove deck_uri from task execution

Signed-off-by: Kevin Su <[email protected]>

* Remove deprecated sidecar job (#302)

Signed-off-by: Katrina Rogan <[email protected]>

* Grpc default service config (#301)

* add load balancer config

Signed-off-by: Babis Kiosidis <[email protected]>

* add policies doc link

Signed-off-by: Babis Kiosidis <[email protected]>

* add available load balancing policies comment

Signed-off-by: Babis Kiosidis <[email protected]>

* add descriptive comment about missing balancer value

Signed-off-by: Babis Kiosidis <[email protected]>

* describe load balancing policy behaviour

Signed-off-by: Babis Kiosidis <[email protected]>

* import balancers

Signed-off-by: Babis Kiosidis <[email protected]>

* skip linting for empty imports

Signed-off-by: Babis Kiosidis <[email protected]>

* lint nolint

Signed-off-by: Babis Kiosidis <[email protected]>

* rely on grpc client functionality for the configs
and remove todo

Signed-off-by: Babis Kiosidis <[email protected]>

* don't repeat the comment

Signed-off-by: Babis Kiosidis <[email protected]>

* change load balancing config to serviceConfig

Signed-off-by: Babis Kiosidis <[email protected]>

* Change config name and remove package preloading

Signed-off-by: Hongxin Liang <[email protected]>

* Test it

Signed-off-by: Hongxin Liang <[email protected]>

Co-authored-by: Babis Kiosidis <[email protected]>

* feat: buf integration for proto release  (#300)

* fix: integrated buf

Signed-off-by: Yuvraj <[email protected]>

* added protoc-gen-swagger

Signed-off-by: Yuvraj <[email protected]>

* buf mod update

Signed-off-by: Yuvraj <[email protected]>

* fix proto path in ci

Signed-off-by: Yuvraj <[email protected]>

Co-authored-by: Yuvraj <[email protected]>

* Add go_package for datacatalog.proto (#304)

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* fix flyteidl version upgrade automation (#305)

Signed-off-by: Yuvraj <[email protected]>

* Update execution.proto (#309)

* Update execution.proto

[Slack conversation](https://flyte-org.slack.com/archives/C01P3B761A6/p1657755908722469)
Signed-off-by: SmritiSatyanV <[email protected]>

* Make generate

Signed-off-by: Kevin Su <[email protected]>

Co-authored-by: Kevin Su <[email protected]>

* Ray task proposal (#308)

* Ray plugin

Signed-off-by: Kevin Su <[email protected]>

* Ray plugin

Signed-off-by: Kevin Su <[email protected]>

* Update cluster spec

Signed-off-by: Kevin Su <[email protected]>

* Update cluster spec

Signed-off-by: Kevin Su <[email protected]>

* Update proto

Signed-off-by: Kevin Su <[email protected]>

* Update proto

Signed-off-by: Kevin Su <[email protected]>

* Update proto

Signed-off-by: Kevin Su <[email protected]>

* Update proto

Signed-off-by: Kevin Su <[email protected]>

* Update proto

Signed-off-by: Kevin Su <[email protected]>

* Fix lint error

Signed-off-by: Kevin Su <[email protected]>

* Remove shutdown_after_job_finishes

Signed-off-by: Kevin Su <[email protected]>

* More comments

Signed-off-by: Kevin Su <[email protected]>

* Allow passing in authentication client secret as an environment variable (#311)

* Read client secret from env var first since the location has a default  (#312)

Signed-off-by: Katrina Rogan <[email protected]>

* Adding device authorization oauth2 flow (#313)

* Added config to skip opening browser for pkce auth

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* added docs

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* increased the default browser session timeout to 2min

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Adding device flow idl changes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* Adding device flow orchestration

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* lint fixes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* nit

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* fixes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* refactor and feedback

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* nit

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* test fixes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* more test fixes

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* feedback

Signed-off-by: Prafulla Mahindrakar <[email protected]>

Signed-off-by: Prafulla Mahindrakar <[email protected]>

* update buf (#316)

Signed-off-by: Katrina Rogan <[email protected]>

Signed-off-by: Katrina Rogan <[email protected]>

* Use grpc client interceptors to properly check for auth requirement (#315)

* Use grpc client interceptors to properly check for auth requirement

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Some refactor and add unit tests

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* lint

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* unit tests

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Attempt a random port

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Listen to localhost only

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* use chain unary interceptor instead

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* only log on errors

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Attempt to disable error check

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Expose newAuthInterceptor to allow other clients to create authenticating clients (#319)

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Add CheckpointUri to TaskNodeMetadata (#322)

* Add CheckpointUri to TaskNodeMetadata

Signed-off-by: Andrew Dye <[email protected]>

* Update swagger-codegen-cli image for arm

Signed-off-by: Andrew Dye <[email protected]>

Signed-off-by: Andrew Dye <[email protected]>

* Project level attributes via matchable resource (#320)

* copy pasta

Signed-off-by: Yee Hing Tong <[email protected]>

* generate

Signed-off-by: Yee Hing Tong <[email protected]>

* fix comment

Signed-off-by: Yee Hing Tong <[email protected]>

Signed-off-by: Yee Hing Tong <[email protected]>

* ClusterAssignment stores cluster pool for execution placement (#321)

* Cluster Pool execution placement

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Make generate

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* wip

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* make generate using new image

Signed-off-by: Iaroslav Ciupin <[email protected]>

* make generate again

Signed-off-by: Iaroslav Ciupin <[email protected]>

* make generate

Signed-off-by: Iaroslav Ciupin <[email protected]>

* make generate go1.18

Signed-off-by: Iaroslav Ciupin <[email protected]>

* make generate go1.18.5

Signed-off-by: Iaroslav Ciupin <[email protected]>

* backward-compatible ClusterAssignment

Signed-off-by: Iaroslav Ciupin <[email protected]>

* make generate

Signed-off-by: Iaroslav Ciupin <[email protected]>

Signed-off-by: Haytham Abuelfutuh <[email protected]>
Signed-off-by: Iaroslav Ciupin <[email protected]>
Co-authored-by: Haytham Abuelfutuh <[email protected]>

* Update token source provider to optionally call GetPublicClientConfig (#326)

* Update token source provider

Signed-off-by: Katrina Rogan <[email protected]>

* GH actions incident

Signed-off-by: Katrina Rogan <[email protected]>

Signed-off-by: Katrina Rogan <[email protected]>

* Make call to auth metadata optional (#327)

* Make call to auth metadata optional

Signed-off-by: Katrina Rogan <[email protected]>

* debug

Signed-off-by: Katrina Rogan <[email protected]>

* revert

Signed-off-by: Katrina Rogan <[email protected]>

* undeprecate

Signed-off-by: Katrina Rogan <[email protected]>

* Add unit tests

Signed-off-by: Katrina Rogan <[email protected]>

* codecov is not very good

Signed-off-by: Katrina Rogan <[email protected]>

Signed-off-by: Katrina Rogan <[email protected]>

* Adding audience field to device flow token request (#314)

* Add createworkflow failure proto #minor (#331)

* Added GateNode message (#296)

* added GateNode message

Signed-off-by: Daniel Rammer <[email protected]>

* added signal service

Signed-off-by: Daniel Rammer <[email protected]>

* fleshed out Signal service

Signed-off-by: Daniel Rammer <[email protected]>

* updated signal service with a GetOrCreateSignal and SetSignal API

Signed-off-by: Daniel Rammer <[email protected]>

* updated signal service api to use GetOrCreate semantics

Signed-off-by: Daniel Rammer <[email protected]>

* added the ListSignals API

Signed-off-by: Daniel Rammer <[email protected]>

* fixed SignalListResponse proto name

Signed-off-by: Daniel Rammer <[email protected]>

* set HTTP API parameters

Signed-off-by: Daniel Rammer <[email protected]>

* generated protos

Signed-off-by: Daniel Rammer <[email protected]>

* documented GateNode

Signed-off-by: Daniel Rammer <[email protected]>

* updated signal list API

Signed-off-by: Daniel Rammer <[email protected]>

* filled out signal list api

Signed-off-by: Daniel Rammer <[email protected]>

* addressing pr comments on docs

Signed-off-by: Daniel Rammer <[email protected]>

* added an output variable name to the signal condition

Signed-off-by: Daniel Rammer <[email protected]>

* reworded signal condition docs

Signed-off-by: Daniel Rammer <[email protected]>

* added ApproveCondition to GateNode

Signed-off-by: Daniel Rammer <[email protected]>

* removed authOpt

Signed-off-by: Daniel Rammer <[email protected]>

* fixed types

Signed-off-by: Daniel Rammer <[email protected]>

* updated doc_gen_deps to fix docs generation

Signed-off-by: Daniel Rammer <[email protected]>

Signed-off-by: Daniel Rammer <[email protected]>

* Generate javascript proto (#336)

* Cache eviction for single execution in datacatalog and flyteadmin (#318)

* Added datacatalog endpoint for updating artifacts
Existing artifacts can have their associated ArtifactData overwritten

Signed-off-by: Nick Müller <[email protected]>

* datacatalog.UpdateArtifact returns ArtifactID

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache override to ExecutionSpec, LaunchPlanSpec and WorkflowExecutionConfig

Signed-off-by: Nick Müller <[email protected]>

* Added CatalogCacheStatus for skipped cache lookups

Signed-off-by: Nick Müller <[email protected]>

* Added skip_cache flag to ExecutionRelaunchRequest

Signed-off-by: Nick Müller <[email protected]>

* Renamed skip_cache flag to overwrite_cache

Signed-off-by: Nick Müller <[email protected]>

Signed-off-by: Nick Müller <[email protected]>

* Doc Hub proposal (#303)

* Add description entity

Signed-off-by: Kevin Su <[email protected]>

* Add id

Signed-off-by: Kevin Su <[email protected]>

* wip

Signed-off-by: Kevin Su <[email protected]>

* few update

Signed-off-by: Kevin Su <[email protected]>

* update service

Signed-off-by: Kevin Su <[email protected]>

* update service

Signed-off-by: Kevin Su <[email protected]>

* Add description entity to task and workflow

Signed-off-by: Kevin Su <[email protected]>

* update des entity

Signed-off-by: Kevin Su <[email protected]>

* update

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* typo

Signed-off-by: Kevin Su <[email protected]>

* address comment

Signed-off-by: Kevin Su <[email protected]>

* update idl

Signed-off-by: Kevin Su <[email protected]>

* list description entity

Signed-off-by: Kevin Su <[email protected]>

* make generate

Signed-off-by: Kevin Su <[email protected]>

* make generate

Signed-off-by: Kevin Su <[email protected]>

* Update service name

Signed-off-by: Kevin Su <[email protected]>

* update endpoint

Signed-off-by: Kevin Su <[email protected]>

* update endpoint

Signed-off-by: Kevin Su <[email protected]>

* remove create_description_entity endpoint

Signed-off-by: Kevin Su <[email protected]>

* Add description to task/workflow

Signed-off-by: Kevin Su <[email protected]>

* update

Signed-off-by: Kevin Su <[email protected]>

* address comments

Signed-off-by: Kevin Su <[email protected]>

* address comments

Signed-off-by: Kevin Su <[email protected]>

* fix tests

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* fix test

Signed-off-by: Kevin Su <[email protected]>

* Add id.resource_type

Signed-off-by: Kevin Su <[email protected]>

* undeclared name: ResourceType

Signed-off-by: Kevin Su <[email protected]>

* update wrong code manually

Signed-off-by: Kevin Su <[email protected]>

* Fixed tests

Signed-off-by: Kevin Su <[email protected]>

* Fixed tests

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>
Co-authored-by: Yee Hing Tong <[email protected]>

* Add a more restrict CreateDownloadLink API (#332)

* Add a more restrict CreateDownloadLink API

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* generate

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* regenerate?

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Add generates_deck to task metadata

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Remove ARTIFACT_TYPE_OUTPUT_METADATA

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* PR Comments

Signed-off-by: Haytham Abuelfutuh <[email protected]>

Signed-off-by: Haytham Abuelfutuh <[email protected]>

* Use buf to generate python stubs (#346)

* Buf python migration

Signed-off-by: Eduardo Apolinario <[email protected]>

* Generate pyi files

Signed-off-by: Eduardo Apolinario <[email protected]>

* Add venv to .gitgnore

Signed-off-by: Eduardo Apolinario <[email protected]>

* Use buf to generate python stubs

Signed-off-by: Eduardo Apolinario <[email protected]>

* Use buf docker image to generate stubs

Signed-off-by: Eduardo Apolinario <[email protected]>

* Add stubs produced by call to `buf generate` using buf's docker image

Signed-off-by: Eduardo Apolinario <[email protected]>

* Add pyi files

Signed-off-by: Eduardo Apolinario <[email protected]>

* Use buf locally

Signed-off-by: Eduardo Apolinario <[email protected]>

* Verify that generated protos by using buf

Signed-off-by: Eduardo Apolinario <[email protected]>

* Copy generated code to a separate artifact

Signed-off-by: Eduardo Apolinario <[email protected]>

* Move back to go_generate.yml@master

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Fix python package and publish typing information (#347)

* Add __init__.py files to generated stubs

Signed-off-by: Eduardo Apolinario <[email protected]>

* Publish stubs in the package

Signed-off-by: Eduardo Apolinario <[email protected]>

* Include __init__.py in verification workflow

Signed-off-by: Eduardo Apolinario <[email protected]>

* Bump versions of remote plugins

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Metadata tags (#348)

* add tags to metadata

Signed-off-by: Yee Hing Tong <[email protected]>

* make generate

Signed-off-by: Yee Hing Tong <[email protected]>

Signed-off-by: Yee Hing Tong <[email protected]>

* Comment annotations and doc generation #minor (#350)

* Comment swagger annotations in proto

Signed-off-by: Eduardo Apolinario <[email protected]>

* Regenerate proto stubs

Signed-off-by: Eduardo Apolinario <[email protected]>

Signed-off-by: Eduardo Apolinario <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>

* Add Databricks config to Spark config (#351)

* databricks plugin

Signed-off-by: Kevin Su <[email protected]>

* update comment

Signed-off-by: Kevin Su <[email protected]>

* Use struct instead of string

Signed-off-by: Kevin Su <[email protected]>

* Add token

Signed-off-by: Kevin Su <[email protected]>

* nit

Signed-off-by: Kevin Su <[email protected]>

* add instance name

Signed-off-by: Kevin Su <[email protected]>

* add instance name

Signed-off-by: Kevin Su <[email protected]>

Signed-off-by: Kevin Su <[email protected]>

* Add inital `dask` plugin IDL (#339)

Signed-off-by: Bernhard Stadlbauer <[email protected]>

* added comments to the catalog reservation API (#355)

Signed-off-by: Daniel Rammer <[email protected]>

Signed-off-by: Daniel Rammer <[email protected]>

* dockerizing buf call (#356)

Signed-off-by: Daniel Rammer <[email protected]>

Signed-off-by: Daniel Rammer <[email protected]>

* Add raw claims to user info response (#357)

Signed-off-by: Katrina Rogan <[email protected]>

Signed-off-by: Katrina Rogan <[email protected]>

* Adding configurable audience property for flyte clients (#329)

* Adding configurable audience property for flyte clients

Signed-off-by: pmahindrakar-oss <[email protected]>

* changed the const audience to audienceKey

Signed-off-by: pmahindrakar-oss <[email protected]>

* fixed unit tests

Signed-off-by: pmahindrakar-oss <[email protected]>

* fixed unit test

Signed-off-by: pmahindrakar-oss <[email protected]>

* nit

Signed-off-by: pmahindrakar-oss <[email protected]>

* feedback

Signed-off-by: pmahindrakar-oss <[email protected]>

* refactored unit tests

Signed-off-by: pmahindrakar-oss <[email protected]>

* Added UseAudienceFromAdmin property to force pull audience from admin config. Default is false and expects clients to pass it

Signed-off-by: pmahindrakar-oss <[email protected]>

* Added test for expected number of calls to the public admin endpoint

Signed-off-by: pmahindrakar-oss <[email protected]>

* fixed the tests

Signed-off-by: pmahindrakar-oss <[email protected]>

Signed-off-by: pmahindrakar-oss <[email protected]>

* Added template configuration to task template (#358)

* added template to task template

Signed-off-by: Daniel Rammer <[email protected]>

* updated docs

Signed-off-by: Daniel Rammer <[email protected]>

* updated pod_template_name location to TaskMetadata proto message

Signed-off-by: Daniel Rammer <[email protected]>

---------

Signed-off-by: Daniel Rammer <[email protected]>

* bumping go version to 1.19 (#363)

Signed-off-by: Daniel Rammer <[email protected]>

* gen (#359)

Signed-off-by: Katrina Rogan <[email protected]>
Co-authored-by: Katrina Rogan <[email protected]>

* Adding support for structured dataset (#369)

Signed-off-by: pmahindrakar-oss <[email protected]>

* added dynamic_job_spec_uri to dynamic workflow metadata and node execution closure (#360)

Signed-off-by: Daniel Rammer <[email protected]>

* Use TokenCache in ClientCredentialsTokenSourceProvider (#377)

* Init customTokenSource.refreshTime (#381)

Signed-off-by: Andrew Dy…
eapolinario pushed a commit that referenced this issue Sep 13, 2023
* Minor grammar changes

Grammar fix
Changed Flyte IDL to Flyteidl
Updated index.rst files

Signed-off-by: SmritiSatyanV <[email protected]>
Signed-off-by: Eduardo Apolinario <[email protected]>
@eapolinario eapolinario reopened this Nov 2, 2023
@github-actions github-actions bot removed the stale label Nov 3, 2023
@hamersaw hamersaw added exo backlogged For internal use. Reserved for contributor team workflow. labels Nov 9, 2023
pvditt pushed a commit that referenced this issue Dec 29, 2023
* Minor grammar changes

Grammar fix
Changed Flyte IDL to Flyteidl
Updated index.rst files

Signed-off-by: SmritiSatyanV <[email protected]>
@sshardool
Contributor

We need this feature as well. Is anyone currently working on it?
We would be happy to collaborate.

eapolinario pushed a commit to eapolinario/flyte that referenced this issue Apr 30, 2024
austin362667 pushed a commit to austin362667/flyte that referenced this issue May 7, 2024
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this issue Jul 2, 2024
troychiu pushed a commit that referenced this issue Jul 8, 2024
## Overview
When kicking off fast tasks, we typically have to do a second round of task evaluation before a worker is available, which adds latency to the initial task runs while the worker(s) come up. This change keeps an in-memory cache of tasks waiting on a worker so that when the first one comes up, we can opportunistically enqueue the owning workflow for evaluation and avoid a ~10s delay.

I chose to use a service-wide lock, which trades off some lock contention for reduced complexity. This is acceptable since we already grab the service-wide `queuesLock` when discovering a new worker (call to `Heartbeat`).
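
As a rough illustration of the approach described above (not the actual implementation), the pending-owner cache can be sketched as follows. The class name `PendingOwnerCache`, its methods, and the `enqueue_owner` callback are hypothetical names chosen for this sketch; only `maxPendingOwnersPerQueue`, `queuesLock`, and `Heartbeat` come from this description. A single lock guards both the worker registry and the pending owners, mirroring the service-wide lock trade-off.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict, List, Set


class PendingOwnerCache:
    """Hypothetical sketch: remembers which workflows wait for a worker, per queue."""

    def __init__(self, enqueue_owner: Callable[[str], None],
                 max_pending_owners_per_queue: int = 100) -> None:
        self._lock = threading.Lock()  # one service-wide lock, for simplicity
        self._pending: Dict[str, Dict[str, str]] = defaultdict(dict)  # queue -> task -> owner
        self._workers: Dict[str, Set[str]] = defaultdict(set)  # queue -> worker ids
        self._enqueue_owner = enqueue_owner
        self._max = max_pending_owners_per_queue  # 0 disables the cache

    def offer_task(self, queue_id: str, task_id: str, owner_id: str) -> bool:
        """Return True if a worker is available; otherwise remember the owner."""
        with self._lock:
            if self._workers[queue_id]:
                self._pending[queue_id].pop(task_id, None)
                return True
            pending = self._pending[queue_id]
            if task_id in pending or len(pending) < self._max:
                pending[task_id] = owner_id
            return False

    def heartbeat(self, queue_id: str, worker_id: str) -> List[str]:
        """Register a worker; on first sight, enqueue every waiting owner."""
        with self._lock:
            is_new = worker_id not in self._workers[queue_id]
            self._workers[queue_id].add(worker_id)
            owners: List[str] = []
            if is_new:
                owners = sorted(set(self._pending.pop(queue_id, {}).values()))
        # Call back outside the lock so a slow enqueue does not block heartbeats.
        for owner in owners:
            self._enqueue_owner(owner)
        return owners
```

In this sketch, the first heartbeat from a new worker immediately re-enqueues the owning workflows instead of leaving them to wait for the next evaluation round.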

## Test Plan
~- [ ] Haven't added any unit tests yet. Wanted to get feedback on the approach~
Going to defer unit tests to the broad pass @hamersaw is doing

- [x] Ran locally and verified that, with the change, tasks do not require a second round of evaluation

Without the enqueue call (2s delay from worker registered -> send task)
```
"2024-05-13T16:42:22-07:00"
"adding pending owner flytesnacks-development/feb7da731f60c482db2d for task feb7da731f60c482db2d-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:42:22-07:00"
"offering task feb7da731f60c482db2d-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:42:22-07:00"
"offering task feb7da731f60c482db2d-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:42:22-07:00"
"offering task feb7da731f60c482db2d-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:42:23-07:00"
"worker 6b772a8b-7748-4819-99ea-140086ca27af registered with queue 4fc648840f89c02"
"2024-05-13T16:42:25-07:00"
"offering task feb7da731f60c482db2d-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:42:25-07:00"
"sending task feb7da731f60c482db2d-n0-0 to worker 6b772a8b-7748-4819-99ea-140086ca27af on queue 4fc648840f89c02"
```

With enqueue call (same second)
```
"2024-05-13T16:48:25-07:00"
"adding pending owner flytesnacks-development/f96f3fe69ae744129ab3 for task f96f3fe69ae744129ab3-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:48:25-07:00"
"offering task f96f3fe69ae744129ab3-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:48:25-07:00"
"offering task f96f3fe69ae744129ab3-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:48:25-07:00"
"offering task f96f3fe69ae744129ab3-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:48:26-07:00"
"worker abb8da76-b4f2-44a2-8fe9-c577f682c914 registered with queue 4fc648840f89c02"
"2024-05-13T16:48:26-07:00"
"offering task f96f3fe69ae744129ab3-n0-0 on queue 4fc648840f89c02"
"2024-05-13T16:48:26-07:00"
"sending task f96f3fe69ae744129ab3-n0-0 to worker abb8da76-b4f2-44a2-8fe9-c577f682c914 on queue 4fc648840f89c02"
```

## Rollout Plan (if applicable)
Not planning to put this behind a config (although we could potentially move `maxPendingOwnersPerQueue` to a config and treat 0 as disabled). Will bring to cloud and deploy in the coming days.

## Upstream Changes
Should this change be upstreamed to OSS (flyteorg/flyte)? If so, please check this box for auditing. Note, this is the responsibility of each developer. See [this guide](https://unionai.atlassian.net/wiki/spaces/ENG/pages/447610883/Flyte+-+Union+Cloud+Development+Runbook/#When-are-versions-updated%3F).
- [ ] To be upstreamed
@nikp1172

nikp1172 commented Sep 3, 2024

This seems to be a critical feature. We should have an option to set a concurrency policy, as Kubernetes CronJobs do: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#concurrency-policy

This is a standard feature, and the current implementation in Flyte makes it unfit for a large class of cron jobs.

I can have workflows that CANNOT run in parallel: a new execution should simply be cancelled, made to wait, or replace the running one. There should be an option for that.

Also, an incident occurred where the cluster was down for some time (a few hours), and as soon as it came back up it was flooded with workflows (from cron), although given the nature of the workflow, running it once after the downtime would have been sufficient. Now I need to manually clean up all the workflows in the cluster.
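
To make the requested behaviour concrete, here is a minimal sketch of the three policy values that Kubernetes CronJobs define (`Allow`, `Forbid`, `Replace`). It is not Flyte code: the `decide` function and its return shape are hypothetical and only illustrate what a scheduler might do when a new scheduled run fires while earlier runs are still active.

```python
from enum import Enum
from typing import List, Tuple


class ConcurrencyPolicy(Enum):
    ALLOW = "Allow"      # run concurrently (the behaviour described in this issue)
    FORBID = "Forbid"    # skip the new run while an earlier one is active
    REPLACE = "Replace"  # abort the active runs and start the new one


def decide(policy: ConcurrencyPolicy, running: List[str],
           new_id: str) -> Tuple[List[str], List[str]]:
    """Return (executions to start, executions to abort) for a new scheduled run."""
    if policy is ConcurrencyPolicy.ALLOW or not running:
        return [new_id], []
    if policy is ConcurrencyPolicy.FORBID:
        return [], []
    return [new_id], list(running)
```

With `Forbid`, an hourly schedule whose previous run is still active would skip the new run rather than produce duplicated output.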

@katrogan
Contributor

katrogan commented Sep 3, 2024

Hi @nikp1172, do you mind checking whether #5659 would be a possible solution for your needs? (Please leave a comment if you notice any issues.)


8 participants