-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ymir imaging I/O helpers #38
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for some non-compliant Nexus files, or why can't we use ScippNexus?
src/ess/imaging/io.py
Outdated
from pathlib import Path | ||
from typing import NewType | ||
|
||
import jax.numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing in project dependencies, or is this unintentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it was unintentional...
src/ess/imaging/io.py
Outdated
] | ||
x_length, y_length = detector_dataset["value"].shape[1:] | ||
return HistogramModeDetectorData( | ||
sc.sort( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think such a sort can be quite costly. Is this required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to slice the data based on the time so it should be sorted along the time.
I think the data is already almost sorted in most cases.
src/ess/imaging/io.py
Outdated
image_detector_name: ImageDetectorName, | ||
histogram_mode_detectors_path: HistogramModeDetectorsPath = DEFAULT_HISTOGRAM_PATH, | ||
) -> HistogramModeDetectorData: | ||
with File(file_path, mode="r") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use h5py.File
for clarity which File
this is.
If this is NeXus, why can't we use ScippNexus?
The 2d detector images are stored in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used scippnexus wherever I could
src/ess/imaging/io.py
Outdated
from pathlib import Path | ||
from typing import NewType | ||
|
||
import jax.numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it was unintentional...
src/ess/imaging/io.py
Outdated
] | ||
x_length, y_length = detector_dataset["value"].shape[1:] | ||
return HistogramModeDetectorData( | ||
sc.sort( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to slice the data based on the time so it should be sorted along the time.
I think the data is already almost sorted in most cases.
fe57450
to
b677655
Compare
Hmm...! I was wrong. I thought scippnexus raised an error when loading the images but it seems okay. I replaced the h5py to scippnexus. |
I just need to update the dependencies to run the CI. |
956a954
to
06b50aa
Compare
e228af4
to
6f4bc9f
Compare
We decided not to use data array to separate images, but the dictionary. |
src/ess/imaging/io.py
Outdated
def _assign_coord_or_make_one( | ||
da: sc.DataArray, coord_name: str, coord: sc.Variable | None | ||
) -> None: | ||
da.coords[coord_name] = coord or sc.arange( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only works when coord
is None
or a scalar without unit. Otherwise it raises a unit error or dimension error. (Note that coord or ...
calls bool(coord)
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it...! Now it makes a dimensionless coordinate only if there is no coordinate.
src/ess/imaging/io.py
Outdated
if min_coord is not None and max_coord is not None: | ||
return da[dim, min_coord:max_coord] | ||
elif min_coord is not None and max_coord is None: | ||
return da[dim, min_coord:] | ||
elif min_coord is None and max_coord is not None: | ||
return da[dim, :max_coord] | ||
else: | ||
return da |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if min_coord is not None and max_coord is not None: | |
return da[dim, min_coord:max_coord] | |
elif min_coord is not None and max_coord is None: | |
return da[dim, min_coord:] | |
elif min_coord is None and max_coord is not None: | |
return da[dim, :max_coord] | |
else: | |
return da | |
return da[dim, min_coord:max_coord] |
You are reimplementing the behaviour of slice
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah... so I don't have to handle None
...? That's great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed it!
if (dim in da.coords.keys()) and (coord is not None): | ||
da.coords[dim] = coord | ||
elif (dim not in da.coords.keys()) and (coord is None): | ||
da.coords[dim] = sc.arange(dim=dim, start=0, stop=da.sizes[dim]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changed the logic. Now it raises if (dim not in da.coords) and (coord is not None)
. Previously, it would insert the new coord.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we don't want a new coordinate if it already exists.
I just wanted to make sure it has a coordinate before slicing.
So, if the loaded data already has a coordinate, you can change the cropping range according to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that contradicts the code. If a coord exists and coord is not None
, you overwrite the existing coord. The case I pointed out above would insert a coord if there is non already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I added that when we want to insert the coordinate but it should be in the loader function not in here.. I'll remove that part....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done...!
src/ess/imaging/io.py
Outdated
sorted_logs = sc.sort(log, 'time') | ||
min_time = sc.datetime(da.coords['time'].min().value - 1, unit='ns') | ||
max_time = sc.datetime(da.coords['time'].max().value + 1, unit='ns') | ||
time_edges = sc.concat((min_time, sorted_logs.coords['time'], max_time), 'time') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks fishy. Why are min_time
and max_time
computed from da.coords['time']
but the edges (except first and last) are constructed from log
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was because of the out-or-range-values and assumming that logs are always less than the actual time coordinate but I think it can go wrong pretty easily...
I updated it so that it just slice the dataset according to the log.
Related to #34
Reference: https://github.com/scipp/ess-legacy/blob/683747138f1775d106809bf17bf5f56b75d6e0d3/imaging/helper_funcs.py#L41-L68
TODO