Fix normed dtype #557
base: main
Pull Request Test Coverage Report for Build 13205728880 (Coveralls)
use sed binning for histogram computation
9962e78 to ae464dd
src/sed/core/processor.py
Outdated
    )
else:
    self._normalization_histogram = normalization_histogram_from_timed_dataframe(
        self._timed_dataframe,
        axis,
        self._binned.coords[axis].values,
        self._config["dataframe"]["timed_dataframe_unit_time"],
        hist_mode=self.config["binning"]["hist_mode"],
This seems repeated. It can probably be moved out of the loop.
I have now restructured this so that less code is repeated.
    Returns:
        xr.DataArray: Calculated normalization histogram.
    """
    bins = df[axis].map_partitions(
Is this removed due to the updated dask version?
I am now using our optimized binning for the timed dataframe. It is somewhat faster, performs the sequential binning using the num_cores parameter, and shows the progress bar. The previous solution used pandas cut to define the bins, which requires bin edges rather than the bin centers that our function takes.
I once checked that the two approaches produce very similar results (there was a very tiny difference, I think because values lying on a bin edge are assigned to either the left or the right bin).
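To illustrate the edge-inclusion point above, here is a minimal sketch using only NumPy and pandas. It is not the sed binning code: `centers_to_edges` is a hypothetical helper that assumes equally spaced bin centers, and `np.histogram` merely stands in for a binner that closes bins on the left. Which side sed's own binning includes is not stated in this thread.

```python
import numpy as np
import pandas as pd


def centers_to_edges(centers):
    """Hypothetical helper: convert equally spaced bin centers to bin edges."""
    centers = np.asarray(centers, dtype=float)
    step = centers[1] - centers[0]
    # One edge half a step below the first center, then one above each center.
    return np.concatenate(([centers[0] - step / 2], centers + step / 2))


centers = np.array([0.5, 1.5, 2.5])
edges = centers_to_edges(centers)  # edges at 0, 1, 2, 3

# Two of the values (1.0 and 2.0) sit exactly on a bin edge.
values = pd.Series([0.2, 1.0, 1.7, 2.0, 2.9])

# pandas.cut uses right-closed intervals (a, b] by default.
codes = pd.cut(values, bins=edges, labels=False).to_numpy()
cut_counts = np.bincount(codes, minlength=len(centers))

# np.histogram uses left-closed intervals [a, b), with the last bin closed.
hist_counts, _ = np.histogram(values, bins=edges)

print(cut_counts, hist_counts)
```

Both histograms contain the same total number of events; only the values that fall exactly on an edge move between neighbouring bins, which is consistent with the very small difference described above.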
Sets the dtype of the normalized data to that of the unnormalized data.
Currently, it takes the dtype of the normalization histogram.
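The dtype behaviour described here can be sketched with plain NumPy arrays. The dtypes below (float32 data, float64 histogram) are assumptions chosen for illustration, and the explicit `astype` cast is just one possible way to restore the dtype, not necessarily what this PR does.

```python
import numpy as np

# Assumed dtypes for illustration only: binned data in float32,
# normalization histogram in float64.
binned = np.array([10, 20, 30], dtype=np.float32)
norm_hist = np.array([2.0, 4.0, 5.0], dtype=np.float64)

# Plain division promotes to the wider dtype of the two operands,
# i.e. the dtype of the normalization histogram in this setup.
normed_default = binned / norm_hist

# Casting back keeps the dtype of the unnormalized data.
normed_fixed = (binned / norm_hist).astype(binned.dtype)

print(normed_default.dtype, normed_fixed.dtype)
```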