
Use essreduce loaders #114

Merged
merged 37 commits into from
Mar 20, 2024

Conversation

nvaytet
Member

@nvaytet nvaytet commented Mar 8, 2024

Fixes #112

Note: CI is failing because essreduce does not exist as a package yet

EDIT by JL:
Also fixes #106, #109, #110

Note:
We have changed the place where we merge events from multiple files in the workflow.
We used to merge the events early, into a single data array. This meant we had to make sure that sample and detector positions were the same for all runs.
Now we compute Q and wavelength for numerator and denominator for each file, and merge just before normalization.
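To illustrate the reordering described in this note, here is a minimal sketch in plain Python (all names and the toy per-file computation are illustrative, not the actual workflow types): each file is reduced on its own, and the results are merged only just before normalization, so sample/detector positions no longer need to agree across runs.

```python
from functools import reduce

def reduce_single_file(events: list[float]) -> dict:
    """Stand-in for the per-file computation of Q/wavelength terms."""
    return {'numerator': sum(events), 'denominator': float(len(events))}

def merge(a: dict, b: dict) -> dict:
    """Merge two per-file results; done just before normalization."""
    return {k: a[k] + b[k] for k in a}

files = [[1.0, 2.0], [3.0], [4.0, 5.0]]
merged = reduce(merge, (reduce_single_file(f) for f in files))
normalized = merged['numerator'] / merged['denominator']  # 3.0
```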

@jl-wynen
Member

jl-wynen commented Mar 8, 2024

@nvaytet Can you target the PR on the namespace pkg branch? Then the diff only shows the relevant changes. Merging that branch will retarget the PR automatically onto main.

@nvaytet nvaytet changed the base branch from main to namespace-package March 8, 2024 12:07
@nvaytet
Member Author

nvaytet commented Mar 8, 2024

@jl-wynen I forgot to remove the old loaders...

Comment on lines 185 to 196
# Note here we specify the name instead of using
# ess.reduce.nexus.extract_detector_data because we need the name to put the
# events back into the original data group.
key = f'{detector_name}_events'
events = _preprocess_data(
    out[key],
    sample_position=raw_sample['position'],
    source_position=raw_source['position'],
)
if detector_name in DETECTOR_BANK_RESHAPING:
    events = DETECTOR_BANK_RESHAPING[detector_name](events)
out[key] = events
Member

Can you explain your comment, and why we need this? Can't these transformations be applied after extracting the detector data?

Member Author

So you're saying we could insert an additional step after the RawData[Sample] has been extracted, where we would reshape and patch (add necessary coordinates, etc.) the data, and output something called PreparedData that would be just before the MaskedData in the chain?

Member

Yes.
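A minimal sketch of the step agreed on here, in plain Python (the type aliases, the detector name, and the placeholder reshaping dict are illustrative, not the final code): after extraction, a provider turns RawData into PreparedData by reshaping the bank and patching missing coordinates, just before MaskedData in the chain.

```python
from typing import NewType

RawData = NewType('RawData', dict)
PreparedData = NewType('PreparedData', dict)

# Placeholder: maps a detector name to its bank-reshaping function.
DETECTOR_BANK_RESHAPING = {
    'larmor_detector': lambda d: {**d, 'shape': 'reshaped'},
}

def prepare_detector_data(raw: RawData, detector_name: str) -> PreparedData:
    """Reshape the bank and patch missing coordinates after extraction."""
    data = dict(raw)
    if detector_name in DETECTOR_BANK_RESHAPING:
        data = DETECTOR_BANK_RESHAPING[detector_name](data)
    # Patch a coordinate needed later in the chain (placeholder value).
    data.setdefault('sample_position', (0.0, 0.0, 0.0))
    return PreparedData(data)
```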

Comment on lines 205 to 207
out = nexus.load_monitor(file_path=file_path, monitor_name=monitor_name)
key = f'{monitor_name}_events'
out[key] = _preprocess_data(out[key], source_position=raw_source['position'])
Member

Similar question to above, why do we need to preprocess before extracting the monitor data? This makes this brittle, as we still hard-code some naming convention.

Base automatically changed from namespace-package to main March 11, 2024 09:21
@jl-wynen
Copy link
Member

ESSreduce has been released

@jl-wynen jl-wynen force-pushed the use-essreduce-loaders branch from 74b5d6c to 38fcbac Compare March 11, 2024 13:34
@nvaytet nvaytet marked this pull request as draft March 11, 2024 14:16
@nvaytet nvaytet marked this pull request as ready for review March 18, 2024 13:46
@nvaytet nvaytet requested a review from SimonHeybrock March 18, 2024 13:46
"providers = providers + (\n",
" isis.data.transmission_from_background_run,\n",
" isis.data.transmission_from_sample_run,\n",
")\n",
"pipeline = sciline.Pipeline(providers, params=params)\n",
"pipeline.set_param_table(masks)"
"# pipeline.set_param_table(masks)\n",
Member

Commented code?

plopp
pythreejs
sciline>=24.2.0
sciline>=23.9.1
Member

Why the version regression?

Member Author

I don't know, @jl-wynen ?

Member

It happened in this commit: e757525 because I updated dependencies. This replaced the bound because it was set in 825f4ad in base.in instead of pyproject.toml.

Member Author

OK, I fixed it in pyproject.toml.
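For reference, a sketch of what the fix looks like (section layout is illustrative; only the sciline pin is from this thread): keeping the lower bound in pyproject.toml rather than in base.in, so a dependency update cannot silently replace it.

```toml
[project]
dependencies = [
    # Keep version bounds here, not in base.in, so that regenerating
    # the pinned requirements cannot lower them.
    "sciline>=24.2.0",
]
```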

@@ -33,7 +28,7 @@ def apply_component_user_offsets_to_raw_data(
    data: RawData[ScatteringRunType],
    sample_offset: Optional[SampleOffset],
    detector_bank_offset: Optional[DetectorBankOffset],
) -> RawDataWithComponentUserOffsets[ScatteringRunType]:
) -> PatchedData[ScatteringRunType]:
Member

In #59 (comment) you argued that the naming I chose, ConfiguredRawData, was unclear. I would argue that PatchedData is considerably less clear than that, since "patching" has multiple meanings. In particular, we are not patching sections of data together. How about going back to something like ConfiguredData after all?

Member Author

@nvaytet nvaytet Mar 18, 2024

So we are adding missing coordinates, variances on the events, and possibly adding offsets to some of the coordinates (in the case of ISIS data).

Either we find a name that covers all 3 well enough (I don't really get the above from ConfiguredData, but PatchedData is indeed quite general), or we should make them into separate steps?

I don't like making separate steps because it means that all of these steps could be optional, and we'd have to have e.g. a dummy step for applying position offsets in loki, just because it is needed in isis.

I basically wanted a name that would mean "ready for the rest of the reduction workflow", but was never able to find a good name :-(

Member

How about just dropping the Raw prefix? This function would turn RawData into Data. But I still don't see what is so wrong with configured-data.

Member Author

I tried ConfiguredAndCompleteData but I still don't really like it. Now I'm thinking of ConfiguredReducibleData to hint that it has the necessary pieces to proceed with the rest of the reduction.

But I still don't see what is so wrong with configured-data.

I think Configured captures the notion that some configuration was applied by the user, e.g. position offsets, but not so much the fetching of the sample/source positions so that we can later perform conversion to wavelength. That's why I thought we were "patching" different parts of the NeXus file together. But I agree that PatchedData is just as vague.

Regarding your PR: I thought we could be more specific there because the name described only the step where user offsets were applied to the data. Now that different things can happen in this step, we do need to go back to something more general.

Comment on lines 247 to 249
return FinalSummedQ[ScatteringRunType, IofQPart](
    reduce(_merge_events_or_histograms, data.values())
)
Member Author

I tried something like

reducer = sc.reduce(data.values())
out = reducer.bins.concat() if reducer.bins is not None else reducer.sum()

but reducer.bins is never None, even if the elements inside data.values() are dense data.

So I went back to functools.reduce.

Member

@SimonHeybrock SimonHeybrock Mar 19, 2024

Why can't you use it as if data.values()[0].bins is None? Or all(x.bins is None for x in data.values())?

Member

@SimonHeybrock SimonHeybrock Mar 19, 2024

To be clear: Doing it in the way you chose is extremely inefficient if you have many files (please give it a try!). You are basically adding an $N^2$ term to the required memory access (reads+writes+allocations).

Member Author

Ok, I thought about getting the first element in data.values() but I didn't think it was worth it. I was under the impression that functools.reduce and sc.reduce were equivalent, including in terms of performance.

Member

Actually, I have to qualify my statement: When summing, I think the current implementation of sc.reduce is actually at a disadvantage, compared to functools.reduce, but for concatenation it is $N^2$.

Comment on lines 21 to 24
def apply_pixel_masks(
    data: RawData[ScatteringRunType],
    masks: Optional[sciline.Series[PixelMaskFilename, PixelMask]],
    data: TofData[ScatteringRunType],
    masked_ids: Optional[sciline.Series[PixelMaskFilename, MaskedDetectorIDs]],
) -> MaskedData[ScatteringRunType]:
Member

Not totally happy with this change, as it will prevent using manual masks that are constructed, e.g., using a condition of the pixel positions, etc. Can the MaskedDetectorIDs -> PixelMask conversion be done in a separate step?

Member Author

The issue was that it was using the RawData[SampleRun] to get the detector ids, and using it also for the BackgroundRun. This led to an infinite recursion error when trying to set a parameter table for using events from multiple runs, I think because it was trying to get a sciline.Series[Filename[BackgroundRun], RawData[SampleRun]] which it could not materialize.

I could have a PixelMask[ScatteringRunType] as an alternative? I think that would work?

Member Author

it will prevent using manual masks that are constructed, e.g., using a condition of the pixel positions

I think in this case you would have to use a different provider anyway, because the masks are sciline.Series[PixelMaskFilename, MaskedDetectorIDs], i.e. they depend on PixelMaskFilename, so most probably wouldn't be created from a condition?

FYI I'm still running into issues with PixelMask[ScatteringRunType].
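For the position-based case being discussed, a hypothetical sketch in plain Python (the function, pixel data, and threshold are all illustrative; in practice this would be a separate provider producing a boolean mask rather than going through MaskedDetectorIDs):

```python
import math

def mask_beyond_radius(
    positions: list[tuple[float, float]], radius: float
) -> list[bool]:
    """True for pixels farther than `radius` from the beam axis."""
    return [math.hypot(x, y) > radius for x, y in positions]

pixels = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)]
mask = mask_beyond_radius(pixels, 1.0)  # [False, False, True]
```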

@nvaytet nvaytet merged commit 0371984 into main Mar 20, 2024
3 checks passed
@nvaytet nvaytet deleted the use-essreduce-loaders branch March 20, 2024 13:35