Skip to content

Commit

Permalink
Merge pull request #135 from mabarnes/restarting-simplification
Browse files Browse the repository at this point in the history
Simplify restart functionality
  • Loading branch information
johnomotani authored Sep 20, 2023
2 parents 6d9da78 + 0049801 commit 808468b
Show file tree
Hide file tree
Showing 10 changed files with 133 additions and 161 deletions.
37 changes: 28 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -32,13 +32,10 @@ The full documentation is online at [https://mabarnes.github.io/moment_kinetics]
```
this significantly decreases the load time but prevents code changes from taking effect when `moment_kinetics.so` is used without repeating the precompilation (to use this option, add an option `-Jmoment_kinetics.so` when starting julia).
4) To run julia with optimization, type
```
$ julia -O3 --project run_moment_kinetics.jl
```
Default input options are specified in `moment_kinetics_input.jl`. The defaults can be modified for a particular run by setting options in a TOML file, for example `input.toml`, which can be passed as an argument
```
$ julia -O3 --project run_moment_kinetics.jl input.toml
```
Options are specified in a TOML file, e.g. `input.toml` here. The defaults are specified in `moment_kinetics_input.jl`.
* To run in parallel, just put `mpirun -np <n>` in front of the call you would normally use, with `<n>` the number of processes to use.
* It may be more convenient when running `moment_kinetics` more than once to work from the Julia REPL, e.g.
```
Expand All @@ -50,27 +47,49 @@ The full documentation is online at [https://mabarnes.github.io/moment_kinetics]
```
julia> run_moment_kinetics("input.toml")
```
5) To make plots and calculate frequencies/growth rates, run
5) To restart a simulation using `input.toml` from the last time point in the existing run directory,
```
$ julia -O3 --project run_moment_kinetics --restart input.toml
```
or to restart from a specific output file - either from the same run or (if the settings are compatible) a different one - here `runs/example/example.dfns.h5`
```
$ julia -O3 --project run_moment_kinetics input.toml runs/example/example.dfns.h5
```
The output file must include distribution functions. When not using parallel I/O there will be multiple output files from different MPI ranks - any one of these can be passed.
* To do the same from the Julia REPL
```
$ julia -O3 --project
julia> run_moment_kinetics("input.toml", restart=true)
```
or
```
julia> run_moment_kinetics("input.toml", restart="runs/example/example.dfns.h5")
```
* When calling the `run_moment_kinetics()` function you can also choose a particular time index to restart from, e.g.
```
julia> run_moment_kinetics("input.toml", restart="runs/example/example.dfns.h5", restart_time_index=42)
```
6) To make plots and calculate frequencies/growth rates, run
```
$ julia --project run_post_processing.jl runs/<directory to process>
```
passing the directory to process as a command line argument. Input options for post-processing can be specified in post_processing_input.jl.
6) Parameter scans (see [Running parameter scans](#running-parameter-scans)) or performance tests can be performed by running
7) Parameter scans (see [Running parameter scans](#running-parameter-scans)) or performance tests can be performed by running
```
$ julia -O3 --project driver.jl
```
If running a scan, it can be parallelised by passing the number of processors as an argument. Scan options are set in `scan_inputs.jl`.
7) Post processing can be done for several directories at once using
8) Post processing can be done for several directories at once using
```
$ julia --project post_processing_driver.jl runs/<directory1> runs/<directory2> ...
```
passing the directories to process as command line arguments. Optionally pass a number as the first argument to parallelise post processing of different directories. Input options for post-processing can be specified in `post_processing_input.jl`.
8) In the course of development, it is sometimes helpful to upgrade the Julia veriosn. Upgrading the version of Julia or upgrading packages may require a fresh installation of `moment_kinetics`. To make a fresh install with the latest package versions it is necessary to remove (or rename) the `Manifest.jl` file in the main directory, and generate a new `Manifest.jl` with step 1) above. It can sometimes be necessary to remove or rename the `.julia/` folder in your root directory for this step to be successful.
9) In the course of development, it is sometimes helpful to upgrade the Julia veriosn. Upgrading the version of Julia or upgrading packages may require a fresh installation of `moment_kinetics`. To make a fresh install with the latest package versions it is necessary to remove (or rename) the `Manifest.jl` file in the main directory, and generate a new `Manifest.jl` with step 1) above. It can sometimes be necessary to remove or rename the `.julia/` folder in your root directory for this step to be successful.
9) One may have to set an environment variable to avoid error messages from the Qt library. If you execute the command
10) One may have to set an environment variable to avoid error messages from the Qt library. If you execute the command
```
$ julia --project run_post_processing.jl runs/your_run_dir/
Expand Down
2 changes: 1 addition & 1 deletion machines/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ $ ./submit-restart.sh <path to input file>.toml
```
will submit a job to run and post-process a restart using input file. The
simulation will restart from the last time point of the previous run
(`restart_moment_kinetics.jl` supports more flexibility, but for now you would
(`run_moment_kinetics.jl` supports more flexibility, but for now you would
need to write your own submission script to pass the options needed for that).

Default parameters for the runs (number of nodes, time limit, etc.) were set up
Expand Down
2 changes: 1 addition & 1 deletion machines/archer/jobscript-restart.template
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,6 @@ export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK

echo "running INPUTFILE $(date)"

srun --distribution=block:block --hint=nomultithread --ntasks=$SLURM_NTASKS bin/julia -Jmoment_kinetics.so --project -O3 --check-bounds=no restart_moment_kinetics.jl INPUTFILE RESTARTFROM
srun --distribution=block:block --hint=nomultithread --ntasks=$SLURM_NTASKS bin/julia -Jmoment_kinetics.so --project -O3 --check-bounds=no run_moment_kinetics.jl --restart INPUTFILE RESTARTFROM

echo "finished INPUTFILE $(date)"
2 changes: 1 addition & 1 deletion machines/marconi/jobscript-restart.template
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,6 @@ source julia.env

echo "running INPUTFILE $(date)"

mpirun -np $SLURM_NTASKS bin/julia -Jmoment_kinetics.so --project -O3 --check-bounds=no restart_moment_kinetics.jl INPUTFILE RESTARTFROM
mpirun -np $SLURM_NTASKS bin/julia -Jmoment_kinetics.so --project -O3 --check-bounds=no run_moment_kinetics.jl --restart INPUTFILE RESTARTFROM

echo "finished INPUTFILE $(date)"
7 changes: 0 additions & 7 deletions restart_moment_kinetics.jl

This file was deleted.

12 changes: 8 additions & 4 deletions src/command_line_options.jl
Original file line number Diff line number Diff line change
Expand Up @@ -16,22 +16,26 @@ const s = ArgParseSettings()
arg_type = String
default = nothing
"restartfile"
help = "Name of NetCDF file to restart from"
help = "Name of output file (HDF5 or NetCDF) to restart from"
arg_type = String
default = nothing
"--debug", "-d"
help = "Set debugging level, default is 0 (no extra debugging). Higher " *
"integer values activate more checks (and increase run time)"
arg_type = Int
default = 0
# Options for tests
"--long"
help = "Include more tests, increasing test run time."
"--restart"
help = "Restart from latest output file in run directory (ignored if " *
"`restartfile` is passed)"
action = :store_true
"--restart-time-index"
help = "Time index in output file to restart from, defaults to final time point"
arg_type = Int
default = -1
# Options for tests
"--long"
help = "Include more tests, increasing test run time."
action = :store_true
"--verbose", "-v"
help = "Print verbose output from tests."
action = :store_true
Expand Down
8 changes: 6 additions & 2 deletions src/communication.jl
Original file line number Diff line number Diff line change
Expand Up @@ -125,8 +125,12 @@ function setup_distributed_memory_MPI(z_nelement_global,z_nelement_local,r_nelem
end
# throw an error if user specified information is inconsistent
if (nrank_per_zr_block*nblocks < nrank_global)
error("ERROR: You must choose global number of processes to be an integer multiple of the number of \n
nblocks = (r_nelement_global/r_nelement_local)*(z_nelement_global/z_nelement_local)")
error("ERROR: You must choose global number of processes to be an integer "
* "multiple of the number of\n"
* "nblocks($nblocks) = (r_nelement_global($r_nelement_global)/"
* "r_nelement_local($r_nelement_local))*"
* "(z_nelement_global($z_nelement_global)/"
* "z_nelement_local($z_nelement_local))")
end

# assign information regarding shared-memory blocks
Expand Down
Loading

0 comments on commit 808468b

Please sign in to comment.