-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refined Node execution/validation; flow variable substitution #57
Conversation
Made all Parameters one line per argument for readability/version control. Increases number of lines for 'small' parameters, but should make any future changes clearer.
`validation` defined in parent Node class. It checks all options are valid, raising `ParameterValidationError` if not. Calls to `node.validate()` are removed from `pyworkflow` when updating/adding node or edge between Nodes. A call to `validate` occurs when the POST route is hit for updating a Node to check any changes to the configuration are valid.
Removes `__init__` from FilterNode, changes `execute` to raise `NotImplementedError()` in not present.
All Nodes we have implemented required the defined number of input nodes, so logic can be simplified to whether two values are equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is phenomenal refactoring both from an extensibility standpoint and general DRYness. Awesome work!
I am having a little trouble keeping all the flow variable- and options-wrangling methods straight in my head. We should think about a clear way to present the approach (within our class diagrams or elsewhere) for milestone 3 and the final defense.
Am I understanding correctly that local flow vars are just regular nodes in the graph, which are caught when parsing the predecessor input? Whereas global ones aren't connected to anything, but are available from within the node config form to override a particular parameter? If you could give me a high-level list of what needs to happen on the front end for this to be supported, that would help a ton! If I am understanding correctly, this includes:
|
The list you have hits most of the high-level needs of the front-end. I made a few in-depth comments to elaborate on these (and the endpoints to hit) in issues #64 and #65. In general, your understanding of how the flow variables work is correct. A Workflow contains:
The differentiator between the two is whether a Node has the Global variables are defined in a separate |
Factors out some duplicate code. PR #57 adds a `to_json()` method that might be able to replace the `extract_node_info()` method here. TBD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great refactor! I added a minor comment and after running seems like everything is good. I will keep going through the changes but I am fine merging this.
""" | ||
display_name = "Flow Control" | ||
|
||
def execute(self, predecessor_data, flow_vars): | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this raise a NotImplementedError instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FlowNodes don't really have execute behavior like the other Nodes; their main logic lies in the get_replacement_value()
method instead, so I don't think a NotImplementedError
is needed here.
This also allows for the children (only the StringNode
) to avoid adding empty execute()
methods.
Node execution/validation
The main Node class now implements a validation method that checks all Parameters, calling their
option.validate()
methods. The calls to validate withinpyworkflow/workflow.py
have also been removed. This addresses the use case mentioned where a Node could potentially be invalid when added to the workspace. Now, Nodes are validated upon saving the configuration, or the 'node update' endpoint.Also changed is the order of operations for Node execution. The main logic in the
Workflow
class has been simplified to:The thinking behind moving input data validation and flow variable substitution into the Workflow class, is anyone developing a custom Node would not have to duplicate this work. Even for the built-in Nodes there was a lot of duplication where each Node validated input and parameters.
The
validate_input_data
and newly-namedget_execution_options
were moved from theNodeUtils
class into the mainNode
class. This approach seems cleaner to me, but there may be use cases I'm not considering.Flow variable substitution
There are currently no front-end options to create or pass these flow variables around, but this PR should handle most of the back-end functionality and updates the endpoint responses for the front-end to use. Currently, flow variables work by adding the following attributes to a Node (for example
ReadCsv
, written here in JSON:On execution, the back-end would substitute the
default_value
stored in the globalFlowNode
with ID "2", for whatever the current value is for "sep" in theReadCsvNode
. Should the user want to revert "sep" back to whatever stored value was there, they would remove the flow variable (perhaps a checkbox, or empty selection from a dropdown?).To pull this information in to the "Node Configuration" pop-up in the front-end, the action would need to call the
GET /node/{node_id}
endpoint which has been updated to return the Node's information as well as any flow variable options (all global variables, and any local variables connected as predecessors).TODO:
This PR updates all Nodes to use the
Parameter
classes @reddigari introduced in #53. OnlyRead/WriteCsv
and theJoinNode
have been updated to use these Parameters during execution; the rest still pass in**self.options
) which may cause issues.There is no validation to ensure a FlowNode matches the Parameter type of a given option. E.g., there are no checks to prevent a
StringNode
value from replacing aBooleanParameter
value.Integration with front-end. Nested-forms might be the solution to provide the JSON data the back-end currently is using, but both back/front might need some changes as we integrate.