Unformatted (binary) checkpointing #558
It seems the main use case of I would also want to know if |
The intention is to support both checkpointing using files and (de)serializing to memory. A copy would be needed for the latter use case, and it would probably be wasteful for the former. Do you expect that the performance overhead would be high enough that it would be worth implementing two different mechanisms? Cereal looks awesome! It would probably be the best choice if we actually had the ability to choose freely about whether to add a dependency. Sadly, I don't think we do. We've been on a similar route with Lepton and the experience is somewhat of a mixed bag. |
I am not quite sure. What I was thinking about is something like 3D metadynamics simulations, where the grid object could be large. But I may be wrong, and perhaps the performance overhead is not a serious problem. I am considering something similar to |
While we cannot use
```cpp
#include <algorithm>   // std::copy
#include <cstdlib>     // std::malloc, std::free
#include <cstring>     // std::memcpy, std::memcmp
#include <iostream>
#include <iterator>    // std::ostream_iterator
#include <ranges>      // std::ranges::join_view (requires C++20)
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

int main() {
  std::vector<int> a{1, 2, 3};
  std::vector<double> b{4.0, 5.0, 6.0};
  // Non-owning views over the raw bytes of each vector
  std::string_view a_sv(reinterpret_cast<const char*>(a.data()), a.size() * sizeof(decltype(a)::value_type));
  std::string_view b_sv(reinterpret_cast<const char*>(b.data()), b.size() * sizeof(decltype(b)::value_type));
  // Present the two byte ranges as a single contiguous-looking range
  auto jv = std::ranges::join_view(std::vector{a_sv, b_sv});
  // the following two lines can be lazy evaluated
  std::stringstream ss; // can be fstream or other stream
  std::copy(jv.begin(), jv.end(), std::ostream_iterator<char>(ss, ""));
  std::string out = ss.str();
  std::cout << "Copied size: " << out.size() << std::endl;
  // compare to the reference
  char* ref = static_cast<char*>(std::malloc(sizeof(int) * a.size() + sizeof(double) * b.size()));
  std::memcpy(ref, a.data(), a.size() * sizeof(int));
  std::memcpy(ref + a.size() * sizeof(int), b.data(), b.size() * sizeof(double));
  std::cout << "Ref size: " << sizeof(int) * a.size() + sizeof(double) * b.size() << std::endl;
  const int cmp = std::memcmp(out.data(), ref, out.size());
  std::cout << "Compare result: " << cmp << std::endl;
  std::free(ref);
  return 0;
}
```
The most immediate use case would be checkpointing in GROMACS. As @HubLot can describe better than me, the MDModules interface places certain restrictions at the moment, which make it really difficult to access the checkpoint file stream directly. Copying data is just safer for now (or at least easier to maintain), but in the future a smarter "zero-copy" mechanism as you suggest may be possible if not even recommended. I don't understand the example though: the first part uses |
I think it is possible to use an object of |
Sure, if there is a smarter container that works for C++11 it would be great to try it, especially if we could also use At the moment the more time consuming part of the work is making all classes run their I/O via objects that have similar semantics as STL streams, but are not derived from them. |
@HanatoK Do you know of a container that can provide what you have in mind for C++11? |
If you refer to |
@HubLot The code below from the LAMMPS interface can serve as colvars/lammps/src/COLVARS/fix_colvars.cpp Lines 890 to 919 in dc08c90
The additional copy in the reader is because we can't guarantee that LAMMPS will still have that buffer by the time Colvars is able to read its contents (due to deferred initialization). EDIT: A draft for this is now in #584 |
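To illustrate why the reader makes an additional copy, here is a minimal sketch, assuming a host code that hands over a temporary buffer and a client that can only parse it later. The class `deferred_state_reader` and its member names are hypothetical and are not taken from the Colvars or LAMMPS sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: not the actual Colvars/LAMMPS interface.
class deferred_state_reader {
public:
  // Called by the host code while its restart buffer is still valid.
  // The bytes are copied because the host may free or reuse the buffer
  // before the deferred initialization gets a chance to parse it.
  void set_input(unsigned char const *data, std::size_t size) {
    assert(data != nullptr || size == 0);
    owned_.assign(data, data + size);
  }

  // True if a state buffer is waiting to be parsed.
  bool has_pending_input() const { return !owned_.empty(); }

  // Called later, once initialization is complete.
  std::vector<unsigned char> const &pending_input() const { return owned_; }

  // Release the copy after the state has been read.
  void clear_input() { owned_.clear(); }

private:
  std::vector<unsigned char> owned_;  // private copy with its own lifetime
};
```

The cost is one extra copy of the state; in exchange, the reader no longer depends on how long the host keeps its buffer alive.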
@jhenin The default of the update branch button is a merge commit, I just rebased it. |
…it input state is defined
in addition, a general round of spelling fixes
Rebased on master |
@jhenin I made some more updates to the doc also considering your concerns about terminology. If you are okay can you go ahead and merge? |
This PR introduces a new class `cvm::memory_stream` that has a similar API to STL streams, but in reality is a glorified wrapper to `std::memcpy()`.
To minimize the likelihood of inheritance-related issues, the class does not derive from the STL stream classes, and instead re-implements select functions only. This means that specific unformatted I/O functions needed to be added to the existing classes, but in most cases the existing code based on stream operators could be converted into templates.
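As a rough sketch of the idea (not the actual `cvm::memory_stream` code), a stream-like class that does not derive from the STL streams could look like the following. The names `memory_buffer_stream`, `write_object`, `read_object` and `write_values` are hypothetical, and only trivially copyable types are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical sketch: a stream-like wrapper around std::memcpy that does
// not inherit from any STL stream class.
class memory_buffer_stream {
public:
  // Append the raw bytes of a trivially copyable value to the buffer.
  template <typename T> memory_buffer_stream &write_object(T const &t) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable type");
    std::size_t const old_size = buffer_.size();
    buffer_.resize(old_size + sizeof(T));
    std::memcpy(buffer_.data() + old_size, &t, sizeof(T));
    return *this;
  }

  // Copy the next sizeof(T) bytes into t, or flag an error if too few remain.
  template <typename T> memory_buffer_stream &read_object(T &t) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies require a trivially copyable type");
    assert(read_pos_ <= buffer_.size());
    if (good_ && (buffer_.size() - read_pos_ >= sizeof(T))) {
      std::memcpy(&t, buffer_.data() + read_pos_, sizeof(T));
      read_pos_ += sizeof(T);
    } else {
      good_ = false;
    }
    return *this;
  }

  // Mimic the "if (stream)" idiom of STL streams.
  explicit operator bool() const { return good_; }

  std::size_t length() const { return buffer_.size(); }

private:
  std::vector<unsigned char> buffer_;
  std::size_t read_pos_ = 0;
  bool good_ = true;
};

// Stream operators, so that existing "os << x" code can be reused.
template <typename T>
memory_buffer_stream &operator<<(memory_buffer_stream &os, T const &t) {
  return os.write_object(t);
}

template <typename T>
memory_buffer_stream &operator>>(memory_buffer_stream &is, T &t) {
  return is.read_object(t);
}

// Code written as a template over the stream type accepts either this
// class or a std::ostream, without any inheritance relationship.
template <typename OStream>
OStream &write_values(OStream &os, int step, double energy) {
  os << step << energy;
  return os;
}
```

With a `memory_buffer_stream`, `write_values(s, 42, 2.5)` appends `sizeof(int) + sizeof(double)` raw bytes; with a `std::ostringstream` the same template produces formatted text instead.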
The new implementation is CI tested on all backends by setting the environment variable `COLVARS_BINARY_RESTART` (see the related doc changes). This variable only controls writing of the state file, because the reading function checks for the magic integer before choosing between unformatted and formatted input.
LAMMPS restart files now embed the unformatted (binary) Colvars state, but using the formatted state file on the side for just Colvars remains as an option consistent with the existing behavior.
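The magic-integer check can be pictured with the sketch below. The constant `example_magic`, the enum `state_format` and the function `detect_state_format` are hypothetical names with a made-up value, not the ones used in Colvars. The sketch also assumes that the buffer was written on a machine with the same byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical magic value, chosen for illustration only.
constexpr std::uint32_t example_magic = 0xC01A5B17u;

enum class state_format { formatted, unformatted };

// Inspect the first bytes of a state buffer: if they match the magic
// integer, treat the rest as unformatted (binary) data; otherwise fall
// back to the formatted (text) reader.
inline state_format detect_state_format(unsigned char const *data,
                                        std::size_t size) {
  assert(data != nullptr || size == 0);
  if (size < sizeof(std::uint32_t)) {
    return state_format::formatted;  // too short to hold the magic integer
  }
  std::uint32_t first_word = 0;
  // memcpy avoids alignment and aliasing problems with raw byte buffers
  std::memcpy(&first_word, data, sizeof(first_word));
  return (first_word == example_magic) ? state_format::unformatted
                                       : state_format::formatted;
}
```

Because the reader decides from the data itself, no switch is needed on the input side, which matches the statement above that the environment variable only affects writing.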
Future improvements could involve writing to or reading from a file stream directly, instead of a memory buffer (see comments by @HanatoK). However, this is not currently allowed by LAMMPS (at least for reading) or, for the foreseeable future, by GROMACS.