Use Output Ports of an Operator to Write Storage #3295
Open
Xiao-zhen-Liu wants to merge 64 commits into master from xiaozhen-use-output-port-for-storage
…orage # Conflicts: # core/amber/src/main/python/core/architecture/packaging/output_manager.py # core/amber/src/main/python/core/runnables/main_loop.py # core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/pythonworker/PythonWorkflowWorker.scala
…orage # Conflicts: # core/amber/src/main/scala/edu/uci/ics/texera/web/resource/dashboard/user/workflow/WorkflowExecutionsResource.scala # core/amber/src/main/scala/edu/uci/ics/texera/web/service/ResultExportService.scala
…n-use-output-port-for-storage # Conflicts: # core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/ScheduleGenerator.scala # core/amber/src/main/scala/edu/uci/ics/texera/web/resource/dashboard/user/workflow/WorkflowExecutionsResource.scala # core/amber/src/main/scala/edu/uci/ics/texera/web/service/ExecutionResultService.scala # core/amber/src/main/scala/edu/uci/ics/texera/web/service/ResultExportService.scala # core/amber/src/main/scala/edu/uci/ics/texera/workflow/WorkflowCompiler.scala # core/workflow-core/src/main/scala/edu/uci/ics/amber/core/storage/VFSURIFactory.scala # core/workflow-operator/src/main/scala/edu/uci/ics/amber/operator/SpecialPhysicalOpFactory.scala # core/workflow-operator/src/main/scala/edu/uci/ics/amber/operator/sink/ProgressiveSinkOpExec.scala
Yicong-Huang requested changes Mar 4, 2025
In general, looks good! Left code comments.
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
core/amber/src/main/python/core/storage/runnables/port_result_writer.py (outdated)
...mber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/ScheduleGenerator.scala (outdated)
...la/edu/uci/ics/amber/engine/architecture/scheduling/resourcePolicies/ResourceAllocator.scala (outdated)
...ala/edu/uci/ics/amber/engine/architecture/worker/managers/OutputPortResultWriterThread.scala (outdated)
core/amber/src/main/scala/edu/uci/ics/texera/web/service/ExecutionResultService.scala (outdated)
core/amber/src/main/scala/edu/uci/ics/texera/workflow/WorkflowCompiler.scala
Yicong-Huang reviewed Mar 5, 2025
core/amber/src/main/scala/edu/uci/ics/amber/engine/architecture/scheduling/Region.scala (outdated)
Yicong-Huang reviewed Mar 5, 2025
...la/edu/uci/ics/amber/engine/architecture/scheduling/resourcePolicies/ResourceAllocator.scala (outdated)
Yicong-Huang approved these changes Mar 6, 2025
LGTM!
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
core/amber/src/main/python/core/architecture/packaging/output_manager.py (outdated)
...mber/src/main/scala/edu/uci/ics/amber/engine/architecture/messaginglayer/OutputManager.scala (outdated)
...mber/src/main/scala/edu/uci/ics/amber/engine/architecture/messaginglayer/OutputManager.scala
...mber/src/main/scala/edu/uci/ics/amber/engine/architecture/messaginglayer/OutputManager.scala (outdated)
...ala/edu/uci/ics/texera/web/resource/dashboard/user/workflow/WorkflowExecutionsResource.scala (outdated)
core/amber/src/main/scala/edu/uci/ics/texera/web/service/ExecutionResultService.scala
core/amber/src/main/scala/edu/uci/ics/texera/web/service/ExecutionResultService.scala
core/amber/src/main/scala/edu/uci/ics/texera/workflow/WorkflowCompiler.scala (outdated)
core/workflow-core/src/main/scala/edu/uci/ics/amber/core/storage/DocumentFactory.scala
Currently the Amber engine creates and uses sink operators to write the results of output ports of an operator. This design causes many problems, mainly because it alters the physical plan. Ideally a physical plan should not be changed once it is compiled.
This PR updates the design to use output ports of an operator instead of sink operators to write port results to storage. The changes implemented in this PR include:
- `GlobalPortIdentity` is moved from `Region` to `workflow-core` so that it is accessible by all the modules.
- The compiler produces the set of `GlobalPortIdentity` whose results need to be viewed. This set is passed along to the scheduler as part of `WorkflowContext.workflowSettings`. In the future, this information will be directly produced by the frontend instead of by the compiler.
- […] `resourceConfig`.
- `AssignPortRequest` is used to indicate whether an output port of a worker needs storage and to pass the storage URI information to a worker. Note that this request is used for both input ports and output ports, and this PR only updates output ports. As a result, for input ports, empty storage URIs will be provided in `AssignPortRequest`. In the future, after we also use input ports to read storage, we will also update and use these storage URIs.

TODOs:
- […] `GlobalPortIdentity` for storage URIs
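
For readers skimming the description, here is a minimal Python sketch of the `AssignPortRequest` behavior described above: output ports that need storage receive a URI, while input ports and all other output ports receive an empty one. Only the names `GlobalPortIdentity` and `AssignPortRequest` come from this PR. The field names, the `storage_uri_for` helper, the `build_assign_port_requests` function, and the URI layout are all assumptions made for illustration and are not the actual Texera definitions.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class GlobalPortIdentity:
    # Hypothetical fields; the real class in workflow-core may differ.
    op_id: str
    port_id: int
    is_input: bool = False


@dataclass(frozen=True)
class AssignPortRequest:
    # Hypothetical shape: one request per port of a worker.
    port: GlobalPortIdentity
    storage_uri: str = ""  # an empty string means "no storage for this port"


def storage_uri_for(execution_id: int, port: GlobalPortIdentity) -> str:
    # Made-up URI layout, for illustration only.
    return f"vfs:///execution/{execution_id}/op/{port.op_id}/port/{port.port_id}/result"


def build_assign_port_requests(
    execution_id: int,
    ports: List[GlobalPortIdentity],
    ports_needing_storage: FrozenSet[GlobalPortIdentity],
) -> List[AssignPortRequest]:
    requests = []
    for port in ports:
        # Only output ports are handled; input ports always get an empty URI.
        needs_storage = (not port.is_input) and port in ports_needing_storage
        uri = storage_uri_for(execution_id, port) if needs_storage else ""
        requests.append(AssignPortRequest(port, uri))
    return requests


scan_out = GlobalPortIdentity("scan", 0)
filter_in = GlobalPortIdentity("filter", 0, is_input=True)
filter_out = GlobalPortIdentity("filter", 0)

requests = build_assign_port_requests(
    execution_id=1,
    ports=[scan_out, filter_in, filter_out],
    ports_needing_storage=frozenset({filter_out}),
)
for request in requests:
    print(request.port.op_id, request.port.is_input, repr(request.storage_uri))
```

In this sketch the set of ports needing storage is kept separate from the list of ports, so choosing which results to store never adds or removes an operator from the plan.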