-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Work towards reusable CVCs #700
base: reusable-cvcs
Are you sure you want to change the base?
Conversation
After some thoughts I feel there is no way to make explicit gradients working correctly with reusable components, so I will disable it in this PR. |
This does not work as Colvars was not designed with automatic differentiation in mind.
The problem of
|
AST Giacomo said that variables_active_smp is used for making the colvar object appearing in the loop multiple times for the original parallelization scheme. As I understand, this means that I can use variables_active directly to build the AST instead of the duplicated items in variables_active_smp.
if (it_find == cvc_info_map.end()) { | ||
cvc_info_map.insert({parent, cvc_info{ | ||
// TODO: Here calc_cvc_values calls cvcs[i]->is_enabled() , which is is_enabled(int f = f_cv_active) | ||
// I know both f_cv_active and f_cvc_active are 0 but are they the same option?? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that f_cv_active
and f_cvc_active
have the same numerical value has no consequence, they should never be used in the same context, because they are respectively only meaningful in a colvar
or cvc
object. The colvardeps
data of these two classes are non-overlapping. The relationship between them is a vertical (parent-child) dependency.
If we were to merge the two levels, then these two features would merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that in
Line 1437 in 0f2d682
if (!cvcs[i]->is_enabled()) continue; |
is_enabled()
is called, and since there is no function parameter passed, so I think it would call Line 167 in 0f2d682
inline bool is_enabled(int f = f_cv_active) const { |
which checks
f_cv_active
instead of f_cvc_active
. My code is to follow what the original calls in calc_cvc_values
and I am not sure if I need to follow it the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, sorry. That is somewhat sloppy writing that I didn't remember well. It does rely on the "active" property being number 0.
Line 224 in 0f2d682
// NOTE that all feature enums should start with f_*_active |
But a class inheriting from colvardeps
could also override is_enabled()
and change this convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much effort do you think it would be to remove the default argument for this virtual function? Remember that this is one of the "issues" that clang-tidy
was complaining about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be very little effort. I'm happy to do that if that helps in any way.
it != cv->variables_active_smp()->end(); ++it) { | ||
// TODO: Bad design! What will happen if CVC a is in a "colvar" block | ||
// that does not support total_force_calc, but is then reused in | ||
// another block that requires total_force_calc even if it supports Jacobian itself??? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the problem here. Assuming a cvc can have several parents: the colvar that does require a total force can enable it in its children cvcs, even if other parents don't require (or support) it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I just thought that if a colvar does not require the total force, then it would disable the corresponding feature of the children, but it seems the code do not check the dependency in that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the other way: disabled by default, and enabled on request - then disabled again if the refcount falls to zero.
lambda_fn, NULL, CKLOOP_NONE, NULL); | ||
} | ||
cvm::decrease_depth(); | ||
return cvm::get_error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parallelization and checking the error
When I wrote and test new code, I found that NAMD may not exit cleanly (some other threads were still running) if one thread was exited. I think the implementation of colvarproxy_namd::error
or at least its use in the parallel region, needs to be revised to use a std::atomic<bool>
to setup an error "bit" instead of calling NAMD_die
directly. Then after all threads are finished and joined, we can use cvm::get_error()
to check the error bit and exit if there is an error.
There seems to be a misunderstanding here. To clarify, the spirit of |
Thanks for the clarifications. You are right that colvars/src/colvarcomp_combination.cpp Lines 58 to 60 in ff35f9c
Also, I cannot use add_child in colvardeps to declare that a sub-CVC is a child of the other, as add_child also does dependency checking, so I think that it is a bad design to check dependencies while calling the initialization function. Calling the init means that Colvars is still parsing options, and the abstract syntax tree is not completely built, but the dependencies are semantic things that should be done after the AST is completely built.
In my opinion, the following structure should be separated from Lines 155 to 162 in ff35f9c
to form the AST, and colvardeps should only be a feature-dependency checker, and not own the AST. Adding new children to the AST should not trigger the checker. In other words, it is better that colvardeps acts somewhat like an LLVM pass.
|
Thanks @HanatoK , I now understand your point better and I fully agree! To move the AST out of colvardeps and allow same-level dependencies, we need to deal with the fact that children objects of a CVC can be CVCs or atom groups, so those should be described by different parts of the feature tree. |
@HanatoK - this is a large and disruptive change set. I think there are items in there that we can agree on. Having an AST alongside the colvardeps class is one. Your initial point 2 was: Bypass the colvar class and take the cvc objects out to build the AST. I think that the colvar class should be completely removed; |
@jhenin's comment is good. There are multiple small fixes here that are definitely worth integrating, but they are mixed with others that pertain to broader design issues. I also agree that the AST could be the most immediate goal. We had discussed in the past the possibility of adding such functionality to If you are aware of suitable classes in the C++ standard library, can you suggest them? |
As far as I understand, the AST is just constructed as the same way as
I mark the PR as draft as it seems too disruptive and might need further discussions. |
@HanatoK I've now looked at this in more detail. This has a lot of merit. What it would need to be merge-ready is at least:
One question: can this be done without implementing the AST design you described just above? If yes, I would rather separate these issues as much as possible and merge code that works to avoid letting this branch diverge too much. Side note: I think you should be able to set your GH account so that the CI actions run in your fork, which would give us CI for this PR. |
Well, actually this PR was not intended to be merged into the master branch directly. Instead, I developed the code based on #644, generalized the idea of reusable CVCs and tried my best to solve the SMP issue.
This PR is based on #644 so I don't want it diverges from the
I tagged this PR with "request for comments" because I didn't think there would be a consensus about how to reuse CVCs, and I opened this PR mainly as a proof of concept to show that how to achieve reusable CVCs while retaining the idea of "distributing the CVCs over SMP threads". I will add the documentation if we have a consensus about reusability and parallelization.
I don't think so. It seems to me that @giacomofiorin likes the idea of moving some "special" CVCs out of the SMP loops and computing them serially. However, I think there will be more and more "special" CVCs relying on other CVCs and biases based on other biases in the future, so it is better to take a unified approach. As I have said in #709 (comment), there could be two approaches for parallelization, namely (i) distributing CVCs and biases over SMP threads, and (ii) using SMP threads to parallelize fine-grained loops like projecting hills, rotating atoms, calculating COMs and more. To achieve general reusability in (i), it has to use an AST or other similar things to determine the order of calculations of CVCs. Approach (ii) could be simpler as it only requires that the calculations of CVCs follow the order of their definitions appeared in the config file, which is what PLUMED does, although I suspect that building an AST could be more useful.
It does run in my fork (see https://github.com/HanatoK/colvars/actions/runs/9766071847/job/26958200729). |
I totally agree with having a unified approach, which would be a win-win for users and developers alike. But that would take significant effort, during which we would need to keep Although GROMACS and LAMMPS have somewhat predictable release schedules, they are also not in sync: one in the Winter, one in Summer. I guess this may be related to the times of the year when people prefer to be inside working in Stockholm vs. Albuquerque :-D And NAMD and VMD do not follow a regular schedule. I do not see the two approaches as antithetical, either: (i) was just easier to implement than (ii), and there was also less pressure to implement (ii) in NAMD, which already provided at least centers of mass computed using domain decomposition. |
My impression is that parallelizing both the calculations of CVs and their inner loops (COM, COG, and rotating positions) could make the code much more complicated. Approach (i) only works well with CPUs, while approach (ii) could be easily ported to GPUs or similar accelerators. Calculating two CVs simultaneously on two "GPU threads" of the same GPU device could kill the performance. |
It just reminds me that if we want to compute CVCs in parallel, we may need to lock the Lines 1487 to 1489 in 2c6c712
Because CVCs A and B may rely on C, and in C the atom group has a fitting group. In such case, it would be better to implement a lock attached to the atom group, the first one (either A or B) accessing the apply_force() of C should obtain the lock, while the second one should wait for the lock.
|
How do we move forward on this?
Either way, this remains a somewhat unwieldy branch but I think parts of it can be extracted and merged separately to make this one easier to follow. |
Well, I believe that no one would really like this PR. This PR uses a lot of hacks and workarounds in the backend to achieve reusable CVCs in SMP with limitations as less as possible (although I am not a big fan of distributing CVCs over threads), including:
colvardeps
looks like an AST but after playing with it I feel it is not a real one, and it really just checks the dependencies of features, and there is no true AST. It would be better if Colvars could be redesigned with a true AST and a dependency checker for it. The dependency checker should not own the AST;colvar
class and take thecvc
objects out to build the AST. I think that thecolvar
class should be completely removed;smp_lock
andsmp_unlock
incolvarproxy_namd
are implemented as creating and destroying locks, so I have changed them;colvar::cvc::modify_children_cvcs_atom_gradients
andpropagate_colvar_force
). When callingcalc_gradients
andapply_force
of a CVC consisting of sub-CVCs, it now propagates the gradients and forces to all its sub-CVCs;smp_lock
. However, it is very coarse-grained so I expect an additional performance penalty. I thought there should be a lock tied to each atom group but found none.In summary, I think that Colvars should be fundamentally changed to achieve better support of reusable components and parallelization.
This PR tries to solve #232, extends #644 and finishes: