Ewoks integration #358

Open
loichuder opened this issue Dec 5, 2024 · 7 comments

@loichuder

Hello 👋

As discussed yesterday, I have put our plan for the Ewoks integration in writing so that we are all on the same page. Feel free to comment if something does not seem right.

Preamble

Ewoks workflows are made of Ewoks tasks (nodes) that execute computing operations and of links that pass data between these tasks. An Ewoks workflow therefore represents a computation made of several steps and is usually saved as a JSON file following the Ewoks specification. In this JSON, steps can be viewed, rearranged, replaced or removed, using EwoksWeb for example.
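
As a rough illustration of that structure, here is a minimal sketch of a two-step workflow built as a Python dictionary and serialized to JSON. The top-level `graph`/`nodes`/`links` layout reflects my understanding of the Ewoks specification and should be checked against it; the task identifiers (`mytasks.segment`, `mytasks.index`) and the input name `peaks` are hypothetical placeholders.

```python
import json

# Hypothetical two-step workflow: the task identifiers and the
# input/output names are placeholders, not real ImageD11 or Ewoks tasks.
workflow = {
    "graph": {"id": "demo_workflow"},
    "nodes": [
        {"id": "segment", "task_type": "method", "task_identifier": "mytasks.segment"},
        {"id": "index", "task_type": "method", "task_identifier": "mytasks.index"},
    ],
    "links": [
        {
            "source": "segment",
            "target": "index",
            # Pass the output of "segment" to the "peaks" input of "index".
            "data_mapping": [{"source_output": "return_value", "target_input": "peaks"}],
        }
    ],
}


def check_links(wf):
    """Return True if every link connects two declared nodes."""
    node_ids = {node["id"] for node in wf["nodes"]}
    return all(
        link["source"] in node_ids and link["target"] in node_ids
        for link in wf["links"]
    )


workflow_json = json.dumps(workflow, indent=2)
print(workflow_json)
```

Because the workflow is plain data, rearranging or replacing a step amounts to editing the `nodes` and `links` entries.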

image

Ewoks can take care of integration concerns such as online processing and remote execution. It also supports several execution engines, including Orange, which provides GUIs for interactive execution.

Ewoks and ImageD11

A typical ImageD11 pipeline goes through several steps. This sequence is given by the notebooks in nbGui/TDXRD:

  • It begins with segmentation of the peaks
  • Then, peak indexing follows

Note that each of these steps is made of several substeps where some back-and-forth is needed: do the processing, generate a plot, use it to evaluate the quality of the processing, change parameters if needed, and repeat.

These substeps are what we propose to implement as Ewoks tasks. Each one should also come with an Orange GUI that shows the plot and allows parameters to be modified to regenerate it. Thanks to Ewoks, these tasks can also be run headless to execute the full processing.

The challenge then lies in identifying these substeps and in deciding how data should be passed between them.
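
One possible way to frame this, as a toy sketch only: each substep is a plain function with named inputs and outputs, and a small runner passes the outputs of one substep to the next. The function names, the parameters and the thresholding logic are invented for illustration and are not ImageD11 code.

```python
def threshold_frame(frame, threshold):
    """Substep 1 (toy): keep pixel values strictly above the threshold."""
    kept = [value for row in frame for value in row if value > threshold]
    return {"kept_values": kept}


def summarize(kept_values):
    """Substep 2 (toy): compute the numbers a GUI would plot or display."""
    return {"n_kept": len(kept_values), "total_intensity": sum(kept_values)}


def run_headless(frame, threshold):
    """Chain the substeps without any GUI, passing data by name."""
    step1 = threshold_frame(frame, threshold)
    return summarize(step1["kept_values"])


frame = [[0, 5, 1], [9, 2, 7]]
result = run_headless(frame, threshold=4)
print(result)
```

An interactive GUI would call the same functions repeatedly with different `threshold` values and redraw the plot each time, while a headless run calls them once with the final parameters.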

Impact on ImageD11

Our plan is to build a separate project on top of ImageD11, that is, to use ImageD11 as a library. As a consequence, we don't plan to make core changes to ImageD11.

However, I foresee that we will end up requesting refinements to the API of ImageD11 objects, most likely to expose additional information.

For example, if a core ImageD11 function does not return the values needed to generate the plot, we will have either to create a new function in ImageD11 or to expose (i.e. return) additional values from this function so that the Ewoks task can generate the plot as well. So no core change, but an API change.
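
A toy sketch of such an API refinement, with invented names that do not correspond to any ImageD11 function: the first function returns only the final result, while the refined variant also returns the intermediate values that a diagnostic plot would need.

```python
from typing import List, NamedTuple


def select_peaks(intensities: List[float], cutoff: float) -> List[float]:
    """Original style (hypothetical): only the selected peaks are returned."""
    return [i for i in intensities if i >= cutoff]


class SelectionResult(NamedTuple):
    selected: List[float]  # same values as select_peaks returns
    rejected: List[float]  # extra information needed to draw a diagnostic plot
    cutoff: float


def select_peaks_with_details(intensities: List[float], cutoff: float) -> SelectionResult:
    """Refined style (hypothetical): same computation, more values exposed."""
    selected = select_peaks(intensities, cutoff)
    rejected = [i for i in intensities if i < cutoff]
    return SelectionResult(selected, rejected, cutoff)


details = select_peaks_with_details([1.0, 4.0, 2.5, 8.0], cutoff=2.5)
print(details)
```

Keeping the original function untouched and adding a richer variant next to it is one way to avoid breaking existing callers.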

These API refinements will have to be discussed on a case-by-case basis with ImageD11 developers as the Ewoks project progresses.

Short-term plan

Our short-term project is to write a single Ewoks task for the segmentation (or one of its substeps). In doing so, we will need to find solutions to issues that will be relevant throughout the project, namely:

  • how inputs/outputs should be handled/loaded/saved
  • if current ImageD11 functions/objects expose the necessary information to reimplement operations in Ewoks
@jonwright
Member

Thanks @loichuder , this all sounds good to me.

Taking the segmentation task as a first example, I have created a separate issue, #359, to remind us to update some documentation to help you.

Related to the EWOKS workflows I have in mind some previous works, in addition to the most recent notebooks that we use:

For workflows, the connections between these boxes seem to be the challenging part to get right. Adapting ImageD11 to read/write in some other formats like Nexus or cif or silx.io would be a useful cleanup for us.

Adapting the existing notebooks to be run under ewoks via papermill (or similar) looks appealing to us at ID11. Right now I like notebooks because users can use jupyterhub to see what they did.

@jonwright
Member

Input from Wout yesterday: Ewoks already has a notebooktask which calls papermill, which then executes the notebook.

This is related to #334. Notebooks would need to be compatible with papermill to be able to run them as scripts, if we want to make them available for Ewoks users.
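
As a rough first check of papermill compatibility: papermill injects its parameters right after the code cell tagged `parameters`, so one can scan a notebook's JSON for that tag. The sketch below uses only the standard library and the notebook file layout (`cells`, `metadata.tags`); the function name and the in-memory notebook are made up for illustration, and having the tag does not by itself guarantee that a notebook with plots and widgets runs non-interactively.

```python
def has_parameters_cell(notebook: dict) -> bool:
    """Return True if a code cell carries the 'parameters' tag."""
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") == "code" and "parameters" in tags:
            return True
    return False


# Made-up minimal notebook; a real one would be loaded with json.load().
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Segmentation"]},
        {
            "cell_type": "code",
            "metadata": {"tags": ["parameters"]},
            "source": ["threshold = 70"],
            "outputs": [],
            "execution_count": None,
        },
    ],
}

print(has_parameters_cell(notebook))
```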

@loichuder
Author

True, this could be a possibility, but I fear the notebooks will need significant changes before they can be executed this way (due to the plotting and interactive widgets inside). This is why we are more inclined to extract the processing code of the notebooks into scripts for now.

This being said, we can try to execute one notebook with papermill and see how it goes...

@jadball
Contributor

jadball commented Jan 10, 2025

We agree that dedicated tasks + GUIs for standard data processing are the end goal. For scanning 3DXRD, our initial transition period could be to use notebook-as-a-task, then we (as a team) slowly replace the notebooks with dedicated tasks.

@abmajith

abmajith commented Jan 16, 2025

Follow-Up on Ewoks Task Draft

Hi @jadball, @jonwright, @loichuder,

Following our meeting on Monday with Carsten, Cedric, and myself, I have drafted the Ewoks tasks along with the parameters needed for each task. Please find the draft document attached.

I should mention that there will be some information flow between the tasks, which is reflected in the draft. The document may appear a bit cryptic at this stage. Therefore, we might need to schedule a meeting to go through it in detail. However, since you are the experts on ImageD11, you might already have a good understanding of it and could provide feedback on where I might have misplaced or misinterpreted the parameters.

Please note that I haven’t fully elaborated on the last two tasks yet. I plan to expand on these in the coming days.

Looking forward to your thoughts.

Best regards,
Abdul Majith

ewoks_flow_task_3dxrd.pdf

@jonwright
Member

Thanks @abmajith. There are a few things I would suggest to 're-phrase' for the document. Perhaps it would help to have this in source format within git somewhere (rather than PDF)? For the first pages:

  • 2.1 detector/motor names. Type = string. For example "eiger". Restricting these to a finite enum of "allowed" names prevents people from using the code elsewhere.
  • 2.1 folders setup: please follow Processing Parameter and File Management #373 here to get a common structure at ESRF
  • 2.2 Users will want to extend the image processing in the future to create their own pipelines. I could not find the raw vs dark+flat then background steps in your PDF?
  • 2.4 See Use nexus detector files for spatial distortion (and poni) #327, we should plan to add pyFAI distortion in nexus format
  • 2.4 Yes to making new parameters. Ideally a "wizard" that gets everything it can find from the raw data (like pixel size, motor positions, etc.) but lets users override it or supply whatever is missing. This is where you would put a list of "known" motor names and ways to guess.
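
The "wizard" idea in the last bullet could look roughly like the sketch below. All names here (`KNOWN_MOTOR_NAMES`, `resolve_parameters`, the metadata keys) are invented for illustration: values found in the raw-data metadata are used first, user-supplied values override them, and anything still missing is reported. The detector name stays a free string rather than an enum.

```python
# Hypothetical list of known rotation motor names, used to guess
# which metadata entry holds the rotation motor.
KNOWN_MOTOR_NAMES = ("rot", "diffrz", "omega")

REQUIRED = ("detector", "pixel_size", "rotation_motor")


def guess_rotation_motor(metadata):
    """Return the first known motor name present in the metadata, if any."""
    for name in KNOWN_MOTOR_NAMES:
        if name in metadata.get("motors", {}):
            return name
    return None


def resolve_parameters(metadata, overrides=None):
    """Merge detected values with user overrides and list what is missing."""
    params = {
        "detector": metadata.get("detector"),  # free string, e.g. "eiger"
        "pixel_size": metadata.get("pixel_size"),
        "rotation_motor": guess_rotation_motor(metadata),
    }
    params.update(overrides or {})  # user-supplied values always win
    missing = [key for key in REQUIRED if params.get(key) is None]
    return params, missing


metadata = {"detector": "eiger", "motors": {"diffrz": 0.0, "samtx": 1.2}}
params, missing = resolve_parameters(metadata)
print(params, missing)
```

Calling `resolve_parameters(metadata, {"pixel_size": 75e-6})` would fill the gap reported in `missing`.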

(I will try to go through the rest of it later)

@abmajith

Thanks, @jonwright, for the review!

I’ll create the Git page for this document and check with @loichuder where to maintain this spec report—either in ewoks3dxrd or within ImageD11 itself.

Regarding your remarks on 2.2, yes, we’ll display raw vs. dark+flat, background, and the frame image in the same window, likely using tabs if the window space is too limited to accommodate all three images.

On 2.1, the file structure will remain consistent with the one produced by the ImageD11 notebook, or more precisely, with the Dataset class from ImageD11 that handles these peaks files and stores them as col3dfile. There won’t be any changes here.
