Running a phase in background #1068

Open
eli-zr opened this issue Oct 23, 2022 · 7 comments

Comments

@eli-zr
Contributor

eli-zr commented Oct 23, 2022

There are times when you have a system that allows testing different sub-systems independently, so there is a benefit in being able to run phases in parallel in order to optimize total test duration. For example, when you burn FW to one component, you can still test a different one.

From my understanding it is not a built-in capability and I wonder if this was previously discussed.
Is this capability aligned with the architecture or does it break it somehow?

Any ideas on how to achieve this would also be welcome.

Thanks.

@gtpalmer
Contributor

This is an issue that greatly interests me as well. I have at least one use case where I've had to do the ugly hack of spawning multiple threads in a single phase.

My understanding is that, given the way threading works in the test executor and the way the framework talks to the station server, this refactor would probably be very difficult.

However, I also think this would be a hugely beneficial feature.

For now, you can consider parallelism at the test level rather than the phase level. That should work, but it still doesn't play too well with the station server.

@gtpalmer
Contributor

I would add that, alongside parallelism and somewhat related to it, we need an easier way of defining test flow. Nesting phase nodes essentially creates a DAG. We could think about a more sophisticated execution approach: rather than traversing the tree in order and executing phases sequentially, we would create pools to execute independent phases, and each phase would be triggered as soon as the phases it depends on complete.
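As a rough illustration of the dependency-triggered execution described above, here is a minimal sketch using only the Python standard library. It is not OpenHTF code: `run_dag`, the `phases` mapping, and the `deps` mapping are hypothetical names invented for this example, and phases are modeled as plain zero-argument callables.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(phases, deps, max_workers=4):
    """Run each phase as soon as every phase it depends on has completed.

    phases: dict mapping a phase name to a zero-argument callable (hypothetical).
    deps:   dict mapping a phase name to the set of names it must wait for.
    Returns a dict mapping each phase name to its return value.
    """
    # Unmet dependencies for every phase that has not been submitted yet.
    remaining = {name: set(deps.get(name, ())) for name in phases}
    results = {}
    running = {}  # future -> phase name

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while remaining or running:
            # Submit every phase whose dependencies are all satisfied.
            ready = [name for name, unmet in remaining.items() if not unmet]
            for name in ready:
                del remaining[name]
                running[pool.submit(phases[name])] = name

            if not running:
                # Nothing is running and nothing is ready: the graph is stuck.
                raise ValueError("dependency cycle or unknown dependency")

            # Block until at least one running phase finishes.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                name = running.pop(future)
                results[name] = future.result()  # re-raises phase exceptions
                for unmet in remaining.values():
                    unmet.discard(name)
    return results
```

With `deps = {"flash": {"power_on"}, "measure": {"power_on"}, "report": {"flash", "measure"}}`, the `flash` and `measure` phases can overlap, while `report` starts only after both finish. A real executor would also need to handle shared test state, which this sketch ignores.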

Building the infra from the ground up in this way might also lend itself towards better logging/post analysis. Imagine a test record that preserves the DAG structure rather than just giving you a flattened list of phases you can't really correlate back to the original phase node descriptors.

@eli-zr
Contributor Author

eli-zr commented Oct 25, 2022

Thanks for your comments @gtpalmer.
Especially on production lines, time is money, so I agree this kind of support would be beneficial.

I've considered using Test objects for this purpose, but this would make the management of our complete test more complex (managing the threads, logging callbacks, station manager etc.). I'll go with this option eventually, if I find no other solution.

@glados-verma
Collaborator

Good points here, thanks all for commenting.

We are also interested in parallel phases, and looked into how they could be done in OpenHTF. The conclusion was that it would basically require rewriting large parts of the OpenHTF executor to support true parallel phases. At that point, we would also want to redesign some of the APIs to work better with potentially parallel phases. Currently we're not planning to make such a change to OpenHTF.

A couple of workarounds can be used:

  1. Running entire tests in parallel - think of a "multi-up" test bench.
  2. Managing your own threads from inside a phase - a bit clunky as you have to manage your own threads and synchronization.
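The second workaround might look roughly like the sketch below. It uses only the standard library; `flash_firmware`, `measure_other_component`, and `combined_phase` are hypothetical placeholders, and the `test` argument merely stands in for whatever object the framework passes to a phase. It assumes the two tasks touch independent hardware and that results are recorded only from the phase's own thread.

```python
from concurrent.futures import ThreadPoolExecutor


def flash_firmware():
    # Placeholder for a slow operation on one component (hypothetical).
    return "fw-ok"


def measure_other_component():
    # Placeholder for an independent check on another component (hypothetical).
    return 3.3


def combined_phase(test=None):
    """A single phase body that runs two independent tasks concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fw_future = pool.submit(flash_firmware)
        meas_future = pool.submit(measure_other_component)
        # result() blocks until the task finishes and re-raises any exception
        # from the worker, so failures surface in the phase's own thread.
        results = {
            "firmware": fw_future.result(),
            "voltage": meas_future.result(),
        }
    # Assumption: the framework's test object is not guaranteed thread-safe,
    # so anything recorded on it should be written here, not in the workers.
    return results
```

From the framework's point of view this is still one phase, so it reports one phase result and one duration for both tasks.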

If and when there's a ground-up rework, parallel phases would definitely be on the feature list.

@arsharma1
Collaborator

This is possible with the PhaseNode implementations I added a few years back.

This can be accomplished with a new Node type:
ParallelPhaseNode, a PhaseCollection that has a list of phases to run in parallel. It should run each node it finds in a separate thread. If there are nested collections (i.e. sequences or groups), those collections should run in order.

The parallel node needs an option to determine how fatal errors affect the parallel processing; either it should stop parallel phases immediately or continue running.

The implementation could be handled by breaking up the TestExecutor a bit to allow for parallel copies, creating copies for new nodes. It's especially important for there to be slightly different TestState objects for the parallel phases.
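To make the proposal above concrete, here is a small self-contained sketch. It is not the OpenHTF API and not an existing implementation: `Sequence`, this `ParallelPhaseNode` class, and the `stop_on_error` flag are hypothetical stand-ins, phases are plain callables, and the separate TestState objects mentioned above are not modeled. Since Python threads cannot be killed, "stop immediately" is approximated by a cooperative check between the phases of a nested sequence.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Sequence:
    """Hypothetical nested collection whose phases run in order."""

    def __init__(self, *phases):
        self.phases = phases

    def run(self, stop_event):
        outputs = []
        for phase in self.phases:
            if stop_event.is_set():
                break  # another branch hit a fatal error; skip the rest
            outputs.append(phase())
        return outputs


class ParallelPhaseNode:
    """Hypothetical collection that runs each child node in its own thread."""

    def __init__(self, *nodes, stop_on_error=True):
        self.nodes = nodes
        self.stop_on_error = stop_on_error

    def _run_branch(self, node, stop_event):
        try:
            if isinstance(node, Sequence):
                return node.run(stop_event)
            return [node()]
        except Exception:
            if self.stop_on_error:
                stop_event.set()  # ask sibling sequences to stop early
            raise

    def run(self):
        stop_event = threading.Event()
        workers = max(1, len(self.nodes))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(self._run_branch, node, stop_event)
                for node in self.nodes
            ]
        # Leaving the with-block waits for every branch to finish.
        results, errors = [], []
        for future in futures:
            exc = future.exception()
            if exc is not None:
                errors.append(exc)
            else:
                results.append(future.result())
        return results, errors
```

For example, `ParallelPhaseNode(phase_a, Sequence(phase_b, phase_c)).run()` runs `phase_a` alongside the ordered pair `phase_b`, `phase_c`, and returns the per-branch outputs together with any exceptions raised.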

@gtpalmer
Contributor

gtpalmer commented Nov 3, 2022

@arsharma1 Do you have code for this parallel phase collection or is it just hypothetical? Seems like a possible solution - though it could still be problematic in terms of sharing the same test state and displaying updates on the webfront (what would the currently running phase be?).

More generally, I agree that the test_api interaction would have to be tweaked. My proposal is along these lines:

  1. Add some infrastructure to create "local" test APIs whose state can be merged after a phase and/or collection is executed.
  2. To make it easier to add new types of phase nodes, and even have user defined nodes, have the execution of the phase node be defined by the node itself.
  3. A phase node (collection or descriptor) can return a record object directly. This object would always have a top-level result enum, and always be serializable. That way you can easily merge it into a test record object or an outer node record.
  4. Update the executor to basically just execute a phase node and then merge the node record and test API into the global test state.
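Points 3 and 4 can be illustrated with a minimal sketch of a mergeable, serializable node record. Everything here is hypothetical: `Result`, `NodeRecord`, `merge_child`, and the "worst result wins" merge rule are assumptions made for the example, not existing OpenHTF types or behavior.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum


class Result(str, Enum):
    """Hypothetical top-level result enum; str mixin keeps it JSON-friendly."""
    PASS = "PASS"
    FAIL = "FAIL"
    ERROR = "ERROR"


# Assumed merge rule: the more severe result wins.
_SEVERITY = {Result.PASS: 0, Result.FAIL: 1, Result.ERROR: 2}


@dataclass
class NodeRecord:
    name: str
    result: Result
    measurements: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def merge_child(self, child):
        """Attach a child record and escalate this node's result if needed."""
        self.children.append(child)
        if _SEVERITY[child.result] > _SEVERITY[self.result]:
            self.result = child.result

    def to_json(self):
        # asdict() recurses into nested NodeRecord children.
        return json.dumps(asdict(self))


# Usage: the executor merges records returned by nodes into an outer record.
root = NodeRecord("test", Result.PASS)
root.merge_child(NodeRecord("flash", Result.PASS, {"fw": "1.0"}))
root.merge_child(NodeRecord("measure", Result.FAIL, {"voltage": 3.1}))
```

Because children are nested rather than flattened, the serialized record keeps the structure of the phase nodes that produced it.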

@gtpalmer
Contributor

gtpalmer commented Nov 3, 2022

I'd love to work on a prototype of this if people are interested, though it's complicated right now because I had been working for Argo AI... I'm not sure if I'll be using openhtf at my next role.


4 participants