Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the option to slice out straws/tubes/layers/pixels #57

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion src/esssans/beam_center_finder.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
mask_wavelength,
to_Q,
)
from .i_of_q import merge_spectra
from .i_of_q import dimension_slicing, merge_spectra
from .logging import get_logger
from .normalization import iofq_denominator, normalize, solid_angle
from .types import (
Expand Down Expand Up @@ -173,6 +173,7 @@ def _iofq_in_quadrants(
detector_to_wavelength,
solid_angle,
calibrate_positions,
dimension_slicing,
]
params = {}
params[UncertaintyBroadcastMode] = UncertaintyBroadcastMode.upper_bound
Expand Down
3 changes: 2 additions & 1 deletion src/esssans/conversions.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
Numerator,
RawMonitor,
RunType,
SlicedMaskedData,
WavelengthMask,
WavelengthMonitor,
)
Expand Down Expand Up @@ -200,7 +201,7 @@ def calibrate_positions(
# for RawData, MaskedData, ... no reason to restrict necessarily.
# Would we be fine with just choosing on option, or will this get in the way for users?
def detector_to_wavelength(
detector: CalibratedMaskedData[RunType],
detector: SlicedMaskedData[RunType],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above... losing information that this is calibrated.

Could the slicing be combined with masking, in terms of the graph structure, to avoid this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the slicing has to be done after the masking because we need all the detector pixels to run the beam center finder, and the masking needs to be done before doing the beam center finder.

If we run the beam center finder on a single layer we will get a different offset.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there actually a compute cost reason for changing this in the workflow? Or could we consider, e.g., returning reduced data per pixel (or in event mode)? Maybe that would take too much memory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in the PR description, it was because of performance. When Judith asked for this I said she could just inspect the results at the end in a single layer or tube. But she said it would be nice if it was faster when playing around with parameters.

But I am open to alternatives

Copy link
Member

@SimonHeybrock SimonHeybrock Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, isn't loading the files generally the slowest part? If we slice in the workflow it will mean the complete files will be loaded N times of we want to look at N tubes?

Edit: If this is mainly about the direct-beam iteration I suppose file-load is not the performance bottleneck. Would it be possible to load files, make a selection, and call the direct-beam workflow on that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always cache parts of the workflow by setting an intermediate result on the pipeline?
E.g.

res = pipeline.compute(CalibrateMaskedData[SampleRun])
pipeline[CalibrateMaskedData[SampleRun]] = res

But I am not sure if this is dangerous and shouldn't be recommended to the users?

If this is mainly about the direct-beam iteration I suppose file-load is not the performance bottleneck.

Note that we are calling the pipeline inside a loop (and it fact we are computing it twice inside the loop: once for full wavelength range and once for wavelength bands). I think that means we are loading all the files at every loop.
We should probably have better caching inside the iteration loop?

Presumably we can cache the solid angle, the numerator, the transmission fraction and monitors, and only evolve the direct beam?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing a similar thing can be done to speed up the beam_center_finder_from_iofq?

Copy link
Member

@SimonHeybrock SimonHeybrock Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just trying that, but it only helps a bit (3 seconds per iteration). There is another source of bad performance in the wavelength-bands pipeline, I am trying to track that down now.

In other words: We should check if this scientist request is actually an XY problem (this solution was suggested because performance is bad).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My guess is this the merge_spectra function is the bottleneck: https://github.com/scipp/esssans/blob/main/src/esssans/i_of_q.py#L238

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct. But it is not the event-mode one, but just the histogram mode. I have some ideas how to improve, will try to look into that tomorrow.

graph: ElasticCoordTransformGraph,
) -> CleanWavelength[RunType, Numerator]:
return CleanWavelength[RunType, Numerator](
Expand Down
21 changes: 21 additions & 0 deletions src/esssans/i_of_q.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,10 +11,12 @@
from .types import (
BackgroundRun,
BackgroundSubtractedIofQ,
CalibratedMaskedData,
CleanDirectBeam,
CleanMonitor,
CleanQ,
CleanSummedQ,
DetectorSlices,
DimsToKeep,
DirectBeam,
IofQ,
Expand All @@ -24,6 +26,7 @@
QBins,
RunType,
SampleRun,
SlicedMaskedData,
UncertaintyBroadcastMode,
WavelengthBands,
WavelengthBins,
Expand Down Expand Up @@ -129,6 +132,23 @@ def resample_direct_beam(
return CleanDirectBeam(func(wavelength_bins, midpoints=True))


def dimension_slicing(
data: CalibratedMaskedData[RunType],
slices: Optional[DetectorSlices] = None,
) -> SlicedMaskedData[RunType]:
"""
Slice out a subset of the detector tubes/straws/layers/pixels.

Note that the slicing must be done after masking and computing the beam center, if
not the results will be different between sliced and non-sliced data.
"""
out = data
if slices is not None:
for dim, sl in slices.items():
out = out[dim, sl]
return SlicedMaskedData[RunType](out)


def _process_wavelength_bands(
wavelength_bands: Optional[WavelengthBands],
wavelength_bins: WavelengthBins,
Expand Down Expand Up @@ -297,4 +317,5 @@ def subtract_background(
resample_direct_beam,
merge_spectra,
subtract_background,
dimension_slicing,
)
2 changes: 1 addition & 1 deletion src/esssans/loki/io.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@
DETECTOR_BANK_RESHAPING = {
params[NeXusDetectorName]: lambda x: x.fold(
dim='detector_number', sizes=dict(layer=4, tube=32, straw=7, pixel=512)
).flatten(dims=['tube', 'straw'], to='straw')
)
}


Expand Down
4 changes: 2 additions & 2 deletions src/esssans/normalization.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@
from scipp.core import concepts

from .types import (
CalibratedMaskedData,
CleanDirectBeam,
CleanMonitor,
CleanSummedQ,
Expand All @@ -21,6 +20,7 @@
NormWavelengthTerm,
Numerator,
RunType,
SlicedMaskedData,
SolidAngle,
Transmission,
TransmissionFraction,
Expand All @@ -34,7 +34,7 @@


def solid_angle(
data: CalibratedMaskedData[RunType],
data: SlicedMaskedData[RunType],
Comment on lines 36 to +37
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit unhappy about this change. I think the previous version, which said "we compute the solid angle based on calibrated data" was clearly expressing the behavior and the intent of the pipeline. As it is now, it could be anything. If we later change the order of slicing and calibration this will be completely broken.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would simply changing the name to SlicedCalibratedMaskedData help or is that not what you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would help partially. I still wonder though if this is correctly expressing the intent. Solid angle could be computed before slicing or after, so this choice is due to cost?

pixel_shape: DetectorPixelShape[RunType],
transform: LabFrameTransform[RunType],
) -> SolidAngle[RunType]:
Expand Down
8 changes: 8 additions & 0 deletions src/esssans/types.py
Original file line number Diff line number Diff line change
Expand Up @@ -142,6 +142,10 @@ class FileList(sciline.Scope[RunType, list], list):
BeamStopRadius = NewType('BeamStopRadius', sc.Variable)
"""Radius of the beam stop"""

DetectorSlices = NewType('DetectorSlices', dict)
"""Dictionary of slices that will be applied to the detector data for selecting only
certain tubes/straws/layers/pixels."""

# 3 Workflow (intermediate) results


Expand Down Expand Up @@ -194,6 +198,10 @@ class CalibratedMaskedData(sciline.Scope[RunType, sc.DataArray], sc.DataArray):
"""Raw data with pixel-specific masks applied and calibrated pixel positions"""


class SlicedMaskedData(sciline.Scope[RunType, sc.DataArray], sc.DataArray):
"""Data that has potentially had some tubes/straws/layers/pixels sliced out"""


class NormWavelengthTerm(sciline.Scope[RunType, sc.DataArray], sc.DataArray):
"""Normalization term (numerator) for IofQ before scaling with solid-angle."""

Expand Down
Loading