1 - Detailed description of problem or enhancement
While testing combinations of ranks and threads, checkpoint files are not generated in some cases.
2 - Describe how to reproduce the issue
Test case is a 2 component system that simulates for 10 us.
Using --checkpoint-sim-period=1us we see checkpoints generated as expected for these cases:
1 rank, 1 thread
1 rank, 2 threads
1 rank, 3 threads
2 ranks, 1 thread
However, with 2 ranks and 2 threads, no checkpoint files are generated at all, and we see the warning message 'no components assigned to rank 1.0 and 1.1'.
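For anyone trying to reproduce this, the rank/thread matrix above can be scripted. The following is only a sketch: the config file name `two_component_test.py` is a placeholder (the actual test file is not attached here), and it assumes the thread count is passed with `--num-threads` and ranks are launched with `mpirun -np`.

```python
# Sketch: build the sst command lines for the rank/thread combinations in this report.
# ASSUMPTIONS: the config file name is hypothetical; threads are set via --num-threads
# and ranks via "mpirun -np".

SST_CONFIG = "two_component_test.py"  # hypothetical name for the 2-component test

# (ranks, threads) pairs from the report; the last one is the failing case.
CASES = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]


def build_command(ranks, threads, config=SST_CONFIG, period="1us"):
    """Return the argv list for one rank/thread combination."""
    sst = [
        "sst",
        f"--num-threads={threads}",
        f"--checkpoint-sim-period={period}",
        config,
    ]
    if ranks > 1:
        # Multi-rank runs are launched through MPI.
        return ["mpirun", "-np", str(ranks)] + sst
    return sst


if __name__ == "__main__":
    for ranks, threads in CASES:
        print(" ".join(build_command(ranks, threads)))
```

Each printed line can be run by hand, checking after each run whether checkpoint files were written.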
As a side request, there appears to be 1 simulation checkpoint log message for each thread. For 3 threads, for example:
# Simulation Checkpoint: Simulated Time 9 us (Real CPU time since last checkpoint 0.01255 seconds)
# Simulation Checkpoint: Simulated Time 9 us (Real CPU time since last checkpoint 0.01257 seconds)
# Simulation Checkpoint: Simulated Time 9 us (Real CPU time since last checkpoint 0.01260 seconds)
# Simulation Checkpoint: Simulated Time 10 us (Real CPU time since last checkpoint 0.01275 seconds)
# Simulation Checkpoint: Simulated Time 10 us (Real CPU time since last checkpoint 0.01278 seconds)
# Simulation Checkpoint: Simulated Time 10 us (Real CPU time since last checkpoint 0.01288 seconds)
However, there is only 1 log message per rank when using multiple ranks. It would be helpful to reduce the log file size by producing only 1 message per checkpoint, regardless of the number of threads and ranks.
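Until that is changed in sst-core, the duplicates can be collapsed after the fact. This is a post-processing workaround sketch, not a proposed fix for the core: it keeps the first checkpoint message for each simulated time and drops the rest, so the slightly different CPU times reported by the other threads are lost. The function name and regular expression are made up for illustration and are based only on the message format shown above.

```python
import re

# Matches the checkpoint message format shown in this report and captures
# the simulated time (e.g. "9 us").
CHECKPOINT_RE = re.compile(
    r"^# Simulation Checkpoint: Simulated Time (?P<time>\S+ \S+) \("
)


def dedupe_checkpoint_lines(lines):
    """Keep only the first checkpoint message per simulated time.

    Lines that are not checkpoint messages pass through unchanged.
    """
    seen = set()
    kept = []
    for line in lines:
        match = CHECKPOINT_RE.match(line)
        if match:
            sim_time = match.group("time")
            if sim_time in seen:
                continue  # duplicate message from another thread or rank
            seen.add(sim_time)
        kept.append(line)
    return kept
```

For example, applying `dedupe_checkpoint_lines` to the six lines of the 3-thread log above leaves one line for 9 us and one for 10 us.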
3 - What Operating system(s) and versions
All
4 - What versions of external libraries (MPI, etc.)
mpirun (Open MPI) 4.1.2
5 - Provide sha1 of all relevant SST repositories (sst-core, sst-elements, etc)
sst-core