Preserve Full Trajectory Information in Parallel Analysis #4891
Comments
@yuxuanzhuang just wanted to say that I saw the PR and plan to review it tomorrow or on Saturday. as for the
The last one is a bigger change though, but it also seems that there are a lot of different properties already, that might be worth unifying under a common class. |
I like the idea of the global |
Hi @yuxuanzhuang, my apologies for delaying the review. I have looked at the PR and decided to take the discussion here. First, I completely understand the rationale behind this, and I agree that it must be implemented. However, there are indeed some issues with the naming, but perhaps generally with how Ideally (and also recalling my GSOC days), I'd love to have all the current-timestamp-related attributes in one namespace (either as a nested attribute, or simply with the same prefix), but that is too much to ask, and will break stuff. But we still can do it here -- the My problem with A sub-suggestion is to have something like But, I'd approach the documentation from that run/computational-group standpoint: there are two groups of parameters, whole-run-related and computational-group-related, and I'd explicitly list them as such in the docs. So action points:
For the last one, perhaps making indeed |
Oh, and the other thing -- if the sole purpose of that is to allow people who write custom parallelization code to iterate through the trajectory independently of the current slicer, why don't you write it as a custom method, instead of making them do |
Is your feature request related to a problem?
In analyses such as `DistanceMatrix` in `diffusionmap`, the analysis of a single frame needs to run over the full trajectory, that is, the trajectory sliced and defined by `start`, `stop`, and `step` in `run()`. However, in the current parallel analysis implementation, information about the full trajectory is lost (only per-process slices are visible).
Describe the solution you'd like
Firstly, in `_setup_frames` (which only runs in the main process), store the sliced trajectory information in `self._global_slicer`. This makes it possible to retrieve global details such as the total number of frames.
Secondly, in `self._compute`, also store `self._global_frame_index` so the analysis can track the global frame index for each frame, rather than just the index within its per-process slice.
The following code illustrates its usage:
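A minimal sketch of the idea, using a toy class rather than MDAnalysis itself. Only `_setup_frames`, `_compute`, `_global_slicer`, `_global_frame_index`, `n_frames`, and the per-group frame index come from the discussion above. `ToyAnalysis`, `split_frames`, `offset`, and `n_parts` are hypothetical names, and the way the frames are split into groups is an assumption made for illustration.

```python
# Toy stand-in for the proposal; this is NOT MDAnalysis code.
# ToyAnalysis, split_frames, offset and n_parts are hypothetical.


def split_frames(frames, n_parts):
    """Split a list of frame numbers into n_parts contiguous groups."""
    size, extra = divmod(len(frames), n_parts)
    groups, begin = [], 0
    for i in range(n_parts):
        end = begin + size + (1 if i < extra else 0)
        groups.append(frames[begin:end])
        begin = end
    return groups


class ToyAnalysis:
    def __init__(self, n_trajectory_frames):
        self.n_trajectory_frames = n_trajectory_frames

    def _setup_frames(self, start=None, stop=None, step=None):
        # Runs once, in the main process: remember the slice of the whole run.
        self._global_slicer = slice(start, stop, step)
        return list(range(self.n_trajectory_frames))[self._global_slicer]

    def _compute(self, group, offset):
        # Runs once per computation group (one per worker in a parallel run).
        self.n_frames = len(group)  # group-local frame count
        records = []
        for local_index, frame in enumerate(group):
            self._frame_index = local_index  # restarts at 0 in every group
            self._global_frame_index = offset + local_index  # position in the whole run
            records.append((frame, self._frame_index, self._global_frame_index))
        return records

    def run(self, start=None, stop=None, step=None, n_parts=1):
        frames = self._setup_frames(start, stop, step)
        records, offset = [], 0
        for group in split_frames(frames, n_parts):
            records.extend(self._compute(group, offset))
            offset += len(group)
        return records


analysis = ToyAnalysis(n_trajectory_frames=10)
serial = analysis.run(step=2, n_parts=1)    # one group holding all 5 frames
parallel = analysis.run(step=2, n_parts=2)  # two groups of 3 and 2 frames

# Total number of frames in the run, recovered from the stored global slice.
n_total = len(range(analysis.n_trajectory_frames)[analysis._global_slicer])
```

In this sketch the group-local index restarts at zero in the second group of the parallel run, while the global index keeps counting across groups, so an analysis that fills one row per frame of an `n_total` by `n_total` matrix could use the global index to pick the row.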
`n_frames` and `frame_index` are not the same for serial and parallel analysis
Related PR
#4745