Merge pull request #146 from mabarnes/parameter-scan-update

Update automated parameter scan functionality
mabarnes · Nov 17, 2023 · 01a8f29
2 parents 2327e49 + 9b8bde1
commit 01a8f29
Showing 18 changed files with 359 additions and 287 deletions.
diff --git a/README.md b/README.md
@@ -194,52 +194,14 @@ links to an incompatible libcurl, causing an error. When compiled from source
 dependency), avoiding the problem.
 
 ## Running parameter scans
-Parameter scans can be run, and can (optionally) use multiple processors. Short
-summary of implementation and usage:
-1) `mk_input()` takes a Dict argument, which can modify values. So `mk_input()`
-    sets the 'defaults' (for a scan), which are overridden by any key/value
-    pairs in the Dict.
-2) `mk_scan_inputs()` (in `scan_input.jl`) creates an Array of Dicts that can
-    be passed to `mk_input()`. It first creates a Dict of parameters to scan
-    over (keys are the names of the variable, values are an Array to scan
-    over), then assembles an Array of Dicts (where each entry in the Array is a
-    Dict with a single value for each variable being scanned). Most variables
-    are combined as an 'inner product', e.g. `{:ni=>[0.5, 1.], :nn=>[0.5, 0.]}`
-    gives `[{:ni=>0.5, :nn=>0.5}, {ni=>1., nn=>0.}]`. Any special variables
-    specified in the `combine_outer` array are instead combined with the rest
-    as an 'outer product', i.e. an entry is created for every value of those
-    variables for each entry in the 'inner-producted' list. [This was just
-    complicated enough to run the scans I've done so far without wasted
-    simulations.]
-3) The code in `driver.jl` picks between a single run (normal case), a
-    performance_test, or creating a scan by calling `mk_scan_input()` and then
-    looping over the returned array, calling `mk_input()` and running a
-    simulation for each entry. This loop is parallelised (with the set of
-    simulations dispatched over several processes - each simulation is still
-    running serially). Running a scan (on 12 processes - actually 13 but the
-    'master' process doesn't run any of the loop bodies, so there are 12
-    'workers'):
-    ```
-    $ julia -O3 --project driver.jl 12
-    ```
-    (runs in serial if no argument is given)
-4) The scan puts each run in a separate directory, named with a prefix
-    specified by `base_name` in `scan_input.jl` and the rest the names and
-    values of the scanned-over parameters (the names are created in
-    `mk_scan_input()` too, and passed as the `:run_name` entry of the returned
-    Dicts).
-5) To run `post_processing.analyze_and_plot_data()` over a bunch of directories
-    (again parallelized trivially, and the number of processes to use is an
-    optional argument, serial if omitted):
-    ```
-    $ julia -O3 --project post_processing_driver.jl 12 runs/scan_name_*
-    ```
-6) Plotting the scan is not so general, `plot_comparison.jl` does it, but is
-    only set up for the particular scans I ran - everything except the charge
-    exchange frequencies is hard-coded in.
-    ```
-    $ julia -O3 --project plot_comparison.jl
-    ```
+Parameter scans (see [Parameter scans](@ref)) can be performed by running
+```
+$ julia -O3 --project run_parameter_scan.jl path/to/scan/input.toml
+```
+A scan can be parallelised by passing the `-p` argument to julia, e.g. to run on 8 processes
+```
+$ julia -p 8 -O3 --project run_parameter_scan.jl path/to/scan/input.toml
+```
 
 ## Tests
 There is a test suite in the `test/` subdirectory. It can be run in a few ways:

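For reference, the 'inner product' / 'outer product' combination described in the removed README text can be sketched as below. This is only an illustration of that description: `combine_scan_values` and its `combine_outer` keyword are hypothetical names, not the API of `moment_kinetics.parameter_scans`, and the real implementation may differ.

```julia
# Hypothetical helper illustrating the removed README description;
# not part of moment_kinetics.
function combine_scan_values(scan::AbstractDict; combine_outer=Symbol[])
    all_keys = sort(collect(keys(scan)))
    inner_keys = filter(k -> k ∉ combine_outer, all_keys)
    outer_keys = filter(k -> k ∈ combine_outer, all_keys)

    # 'Inner product': the i-th value of every inner variable goes into run i,
    # so all inner variables must have the same number of values.
    lengths = unique([length(scan[k]) for k in inner_keys])
    length(lengths) <= 1 || error("inner-product variables must have equal lengths")
    n_inner = isempty(lengths) ? 1 : only(lengths)
    runs = [Dict{Symbol,Any}(k => scan[k][i] for k in inner_keys) for i in 1:n_inner]

    # 'Outer product': each existing run is repeated once for every value of
    # each outer variable.
    for k in outer_keys
        runs = [merge(r, Dict{Symbol,Any}(k => v)) for r in runs for v in scan[k]]
    end
    return runs
end

# The README example: two runs, pairing the values position by position.
inner_runs = combine_scan_values(Dict(:ni => [0.5, 1.0], :nn => [0.5, 0.0]))

# Adding a third variable (an assumed name, :T) as an outer product gives 2 * 3 runs.
outer_runs = combine_scan_values(
    Dict(:ni => [0.5, 1.0], :nn => [0.5, 0.0], :T => [1.0, 2.0, 3.0]);
    combine_outer=[:T],
)
```

Pairing values position by position keeps the number of runs small; the outer product is only needed for variables that should be varied against every other combination.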
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -13,6 +13,7 @@ Pages = ["getting_started.md",
          "moment_constraints_notes.md",
          "boundary_conditions_notes.md",
          "external_sources_notes.md",
+         "parameter_scans.md",
          "developing.md",
          "debugging-hints.md",
         ]

diff --git a/docs/src/parameter_scans.md b/docs/src/parameter_scans.md
@@ -0,0 +1,60 @@
+Parameter scans
+===============
+
+Running a scan
+--------------
+
+Parameter scans can be run using the `run_parameter_scan.jl` script. To run from the REPL
+```julia
+$ julia -p 8 --project -O3
+julia> include("run_parameter_scan.jl")
+julia> run_parameter_scan("path/to/an/input/file.toml")
+```
+or to run a single scan from the command line
+```shell
+$ julia -p 8 --project -O3 run_parameter_scan.jl path/to/an/input/file.toml
+```
+The `-p 8` argument passed to julia in these examples is optional. It indicates
+that julia should use 8 processes for parallelism. In this case we are not
+using MPI - each run in the scan is run in serial, but up to 8 (in this
+example) runs from the scan can be performed simultaneously (using the
+`@distributed` macro).
+
+The runs can use MPI - in this case call julia using `mpirun`, etc. as usual
+but do not pass the `-p` argument. Mixing MPI and `@distributed` would cause
+oversubscription and slow everything down. The runs will run one after the
+other, and each run will be MPI parallelised.
+
+The inputs (see [`moment_kinetics.parameter_scans.get_scan_inputs`](@ref)) can
+be passed to the function in a Dict, or read from a TOML file.
+
+`run_parameter_scan` can also be passed a directory (either as an argument to
+the function or from the command line), in which case it will perform a run for
+every input file contained in that directory.
+
+Post processing a scan
+----------------------
+
+[`moment_kinetics.makie_post_processing.makie_post_process`](@ref) can be
+called for each run in a scan. For example to post process the scan in
+`runs/scan_example` from the REPL
+```julia
+$ julia -p 8 --project -O3
+julia> include("post_process_parameter_scan.jl")
+julia> post_process_parameter_scan("runs/scan_example/")
+```
+or from the command line
+```shell
+$ julia -p 8 --project -O3 post_process_parameter_scan.jl runs/scan_example/
+```
+Again the `-p 8` argument passed to julia in these examples is optional. It
+indicates that julia should use 8 processes for parallelism. Each run in the
+scan is post-processed in serial, but up to 8 (in this example) runs from the
+scan can be post-processed simultaneously (using the `@distributed` macro).
+
+API
+---
+
+```@autodocs
+Modules = [moment_kinetics.parameter_scans]
+```
diff --git a/docs/src/zz_scan_input.md b/docs/src/zz_scan_input.md
diff --git a/driver.jl b/driver.jl
diff --git a/post_process_parameter_scan.jl b/post_process_parameter_scan.jl
@@ -0,0 +1,23 @@
+using Pkg
+Pkg.activate(".")
+
+using Distributed
+
+@everywhere using moment_kinetics.makie_post_processing: makie_post_process
+
+# post-process every run directory found in `scan_dir` (given on the command line)
+function post_process_parameter_scan(scan_dir)
+    run_directories = Tuple(d for d ∈ readdir(scan_dir, join=true) if isdir(d))
+    @sync @distributed for d ∈ run_directories
+        println("post-processing ", d)
+        try
+            makie_post_process(d)
+        catch e
+            println(d, " failed with ", e)
+        end
+    end
+end
+
+if abspath(PROGRAM_FILE) == @__FILE__
+    post_process_parameter_scan(ARGS[1])
+end
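The `try`/`catch` inside the `@distributed` loop above means one failing run does not stop the rest of the scan from being post-processed. A minimal, self-contained sketch of that pattern follows. It assumes a hypothetical `process_item` in place of `makie_post_process`, and uses the reducer form `@distributed (vcat)` so that the per-item status can be collected, which the script in this PR does not do.

```julia
using Distributed

# Hypothetical stand-in for `makie_post_process`: fails for any name
# containing "bad". `@everywhere` defines it on all worker processes too.
@everywhere function process_item(name)
    occursin("bad", name) && error("cannot process $name")
    return length(name)
end

# Process every item, recording success or failure instead of aborting.
# With a reducer, `@distributed` waits for all iterations and returns the
# reduced value, so no explicit `@sync` is needed here.
function process_all(names)
    return @distributed (vcat) for name in names
        status = try
            process_item(name)
            :ok
        catch
            :failed
        end
        [(name, status)]
    end
end

results = process_all(["run_a", "bad_run", "run_b"])
for (name, status) in results
    println(name, " => ", status)
end
```

When julia is started without `-p`, the loop simply runs on the main process, so the same code works in serial and in parallel.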
diff --git a/post_processing_driver.jl b/post_processing_driver.jl
diff --git a/run_parameter_scan.jl b/run_parameter_scan.jl
@@ -0,0 +1,38 @@
+using Pkg
+Pkg.activate(".")
+
+using Distributed
+
+@everywhere using moment_kinetics
+using moment_kinetics.parameter_scans: get_scan_inputs
+
+"""
+    run_parameter_scan(args...)
+
+Run a parameter scan, getting the inputs for each run from
+[`moment_kinetics.parameter_scans.get_scan_inputs`](@ref).
+
+If MPI is not used (i.e. each run should be run in serial), then `@distributed`
+parallelism can be used to launch several runs at the same time. To do this, start julia
+using the `-p` option to set the number of distributed processes, for example
+```shell
+\$ julia --project -p 8 run_parameter_scan.jl examples/something/scan_foobar.toml
+```
+
+When MPI is used, do not pass the `-p` flag to julia. Each run will run in parallel using
+MPI, and the different runs in the scan will be started one after the other.
+"""
+function run_parameter_scan(args...)
+    scan_inputs = get_scan_inputs(args...)
+
+    @sync @distributed for s ∈ scan_inputs
+        println("running ", s["run_name"])
+        run_moment_kinetics(s)
+    end
+
+    return nothing
+end
+
+if abspath(PROGRAM_FILE) == @__FILE__
+    run_parameter_scan()
+end
diff --git a/scan_input.jl b/scan_input.jl
diff --git a/src/moment_kinetics.jl b/src/moment_kinetics.jl
@@ -56,7 +56,7 @@ include("source_terms.jl")
 include("numerical_dissipation.jl")
 include("load_data.jl")
 include("moment_kinetics_input.jl")
-include("scan_input.jl")
+include("parameter_scans.jl")
 include("analysis.jl")
 include("post_processing_input.jl")
 include("post_processing.jl")
@@ -85,7 +85,7 @@ using .load_data: reload_evolving_fields!
 using .looping
 using .moment_constraints: hard_force_moment_constraints!
 using .looping: debug_setup_loop_ranges_split_one_combination!
-using .moment_kinetics_input: mk_input, read_input_file, run_type, performance_test
+using .moment_kinetics_input: mk_input, read_input_file
 using .time_advance: setup_time_advance!, time_advance!
 using .type_definitions: mk_int
 using .utils: to_minutes
@@ -95,26 +95,26 @@ using .utils: to_minutes
 """
 main function that contains all of the content of the program
 """
-function run_moment_kinetics(to::TimerOutput, input_dict=Dict(); restart=false,
-                             restart_time_index=-1)
+function run_moment_kinetics(to::Union{TimerOutput,Nothing}, input_dict=Dict();
+                             restart=false, restart_time_index=-1)
     mk_state = nothing
     try
         # set up all the structs, etc. needed for a run
         mk_state = setup_moment_kinetics(input_dict; restart=restart,
                                          restart_time_index=restart_time_index)
 
         # solve the 1+1D kinetic equation to advance f in time by nstep time steps
-        if run_type == performance_test
-            @timeit to "time_advance" time_advance!(mk_state...)
-        else
+        if to === nothing
             time_advance!(mk_state...)
+        else
+            @timeit to "time_advance" time_advance!(mk_state...)
         end
 
         # clean up i/o and communications
         # last 3 elements of mk_state are ascii_io, io_moments, and io_dfns
         cleanup_moment_kinetics!(mk_state[end-2:end]...)
 
-        if block_rank[] == 0 && run_type == performance_test
+        if block_rank[] == 0 && to !== nothing
             # Print the timing information if this is a performance test
             display(to)
             println()
@@ -144,8 +144,8 @@ end
 """
 overload which takes a filename and loads input
 """
-function run_moment_kinetics(to::TimerOutput, input_filename::String; restart=false,
-                             restart_time_index=-1)
+function run_moment_kinetics(to::Union{TimerOutput,Nothing}, input_filename::String;
+                             restart=false, restart_time_index=-1)
     return run_moment_kinetics(to, read_input_file(input_filename); restart=restart,
                                restart_time_index=restart_time_index)
 end
@@ -154,7 +154,7 @@ end
 overload with no TimerOutput arguments
 """
 function run_moment_kinetics(input; restart=false, restart_time_index=-1)
-    return run_moment_kinetics(TimerOutput(), input; restart=restart,
+    return run_moment_kinetics(nothing, input; restart=restart,
                                restart_time_index=restart_time_index)
 end
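The change to `Union{TimerOutput,Nothing}` makes timing opt-in: passing `nothing` skips the timing branch, and the convenience overload now defaults to `nothing`. A rough sketch of the same pattern, assuming a hypothetical `SimpleTimer` type in place of `TimerOutput` so that the example needs no external package:

```julia
# Hypothetical stand-in for TimerOutputs.TimerOutput.
mutable struct SimpleTimer
    elapsed::Float64
end
SimpleTimer() = SimpleTimer(0.0)

# Do some work, timing it only when a timer is supplied.
function advance(timer::Union{SimpleTimer,Nothing}, nsteps::Int)
    work() = sum(sqrt(i) for i in 1:nsteps)
    if timer === nothing
        return work()
    else
        t0 = time()
        result = work()
        timer.elapsed += time() - t0
        return result
    end
end

# Overload with no timer argument: no timing overhead by default.
advance(nsteps::Int) = advance(nothing, nsteps)

untimed = advance(1000)
timer = SimpleTimer()
timed = advance(timer, 1000)
```

Both calls return the same value; only the second one updates `timer.elapsed`.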
 

diff --git a/src/moment_kinetics_input.jl b/src/moment_kinetics_input.jl
@@ -22,9 +22,6 @@ using ..reference_parameters
 using MPI
 using TOML
 
-@enum RunType single performance_test scan
-const run_type = single
-
 """
 Read input from a TOML file
 """
@@ -523,7 +520,7 @@ function mk_input(scan_input=Dict(); save_inputs_to_txt=false, ignore_MPI=true)
         # Make file to log some information about inputs into.
         # check to see if output_dir exists in the current directory
         # if not, create it
-        isdir(output_dir) || mkdir(output_dir)
+        isdir(output_dir) || mkpath(output_dir)
         io = open_ascii_output_file(string(output_dir,"/",run_name), "input")
     else
         io = devnull
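The switch from `mkdir` to `mkpath` matters when the parent of `output_dir` does not exist yet, which is presumably the case for scan runs written to nested directories such as `runs/scan_example/...`. A small sketch of the difference, using a temporary directory and made-up path components:

```julia
# Compare `mkdir` and `mkpath` on a nested path whose parents do not exist.
function demo_mkpath()
    mktempdir() do tmp
        nested = joinpath(tmp, "runs", "scan_example", "run_1")

        # `mkdir` only creates the final component, so it throws here
        # because `tmp/runs/scan_example` is missing.
        mkdir_failed = try
            mkdir(nested)
            false
        catch
            true
        end

        # `mkpath` creates all missing intermediate directories, and does
        # not error if the directory already exists.
        mkpath(nested)
        mkpath(nested)

        return (mkdir_failed=mkdir_failed, exists=isdir(nested))
    end
end

result = demo_mkpath()
println(result)
```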