Collect and print/save timings #276

Merged
merged 13 commits into from
Oct 9, 2024
38 changes: 38 additions & 0 deletions docs/src/developing.md
@@ -66,6 +66,44 @@ typed, which could impact performance by creating code that is not 'type
stable' (i.e. all concrete types are known at compile time).


## Timings

Timing different parts of the code helps to check that changes do not introduce
performance problems. Excessive allocations can also be a sign of type
instability (or other problems) that could impact performance. To monitor both,
`moment_kinetics` uses a `TimerOutput` object,
[`moment_kinetics.timer_utils.global_timer`](@ref).
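
As a rough illustration of how a `TimerOutput` collects timings and allocation
counts, here is a minimal sketch using the `TimerOutputs.jl` package directly.
The timer `example_timer`, the function `build_and_sum`, and the labels are
hypothetical names chosen for this example. They are not part of
`moment_kinetics`, whose own timers may be set up differently.
```julia
using TimerOutputs

# Hypothetical stand-alone timer (not moment_kinetics' global_timer).
const example_timer = TimerOutput()

function build_and_sum(n)
    # Each @timeit records the elapsed time and allocations of its expression
    # under the given label, and returns the value of the expression.
    x = @timeit example_timer "allocate" rand(n)
    s = @timeit example_timer "sum" sum(x)
    return s
end

build_and_sum(1000)

# Print a table of call counts, times and allocations for each label.
print_timer(example_timer)
```
A section with unexpectedly large allocations in such a table is a good place
to start looking for type instabilities.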

The timings and allocation counts from the rank-0 MPI process are printed to
the terminal at the end of a run. The same information is also saved to the
output file as a string for quick reference - one way to view this is
```bash
$ h5dump -d /timing_data/global_timer_string my_output_file.moments.h5
```
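
The same string could also be read from within Julia. The following is only a
sketch, assuming the output file is a standard HDF5 file readable with the
`HDF5.jl` package and that the dataset path matches the one passed to `h5dump`
above. The helper `read_timer_string` is a hypothetical name, not a function
provided by `moment_kinetics`.
```julia
using HDF5

# Hypothetical helper: return the saved timer summary as a String.
function read_timer_string(filename)
    h5open(filename, "r") do file
        # Same dataset path as in the h5dump command.
        read(file["timing_data/global_timer_string"])
    end
end
```
Calling `println(read_timer_string("my_output_file.moments.h5"))` would then
print the saved summary.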

More detailed timing information is saved for each MPI rank into subgroups
`rank<i>` of the `timing_data` group in the output file. This information can
be plotted using [`makie_post_processing.timing_data`](@ref). The plots contain
many curves. Filtering out the ones you are not interested in (using the
`include_patterns`, `exclude_patterns`, and/or `ranks` arguments) can help, but
it may still be useful to have interactive plots that show the label and MPI
rank when you hover over a curve. For example:
```julia
julia> using makie_post_processing, GLMakie
julia> ri = get_run_info("runs/my_example_run/")
julia> timing_data(ri; interactive_figs=:times);
```
Here `using GLMakie` selects the `Makie` backend that provides interactive
plots, and the `interactive_figs` argument specifies that `timing_data()`
should make an interactive plot (in this case for the execution times).

Lower-level timing data, for example timings of MPI and linear-algebra calls,
can be enabled by activating 'debug timing'. This is done by re-defining the
function [`moment_kinetics.timer_utils.timeit_debug_enabled`](@ref) to return
`true`. This is not the most user-friendly interface, but the feature is
probably only needed while developing, profiling, or debugging.
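
The mechanism can be sketched with a toy module, assuming the debug timers are
built on the `@timeit_debug` macro from `TimerOutputs.jl`. The module
`ToyTimers`, its `timer`, and the function `work` are hypothetical stand-ins;
for `moment_kinetics` itself the analogous step would presumably be to
evaluate the re-definition in `moment_kinetics.timer_utils`.
```julia
using TimerOutputs

# Toy module standing in for a package that contains debug timers.
module ToyTimers
using TimerOutputs
const timer = TimerOutput()
# Debug timers are skipped while this function returns `false`.
timeit_debug_enabled() = false
function work()
    @timeit_debug timer "inner sum" sum(rand(100))
end
end

ToyTimers.work()  # not recorded: debug timing is disabled

# Re-define the switch inside the module to activate the debug timers.
@eval ToyTimers timeit_debug_enabled() = true

ToyTimers.work()  # recorded now
print_timer(ToyTimers.timer)
```
Only the call made after the re-definition appears in the printed table.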


## Parallelization

The code is currently parallelized using MPI and shared-memory arrays. Arrays representing the pdf, moments, etc. are shared between all processes. Using shared memory means, for example, that we can take derivatives along one dimension while parallelizing over the others, for any dimension, without having to communicate to redistribute the arrays. Using shared-memory parallelism instead of (or, in future, in addition to) distributed-memory parallelism has the advantage that it is easier to split up the points within each element between processors, giving finer-grained parallelism that should let the code use larger numbers of processors efficiently.
6 changes: 6 additions & 0 deletions docs/src/zz_timer_utils.md
@@ -0,0 +1,6 @@
`timer_utils`
=============

```@autodocs
Modules = [moment_kinetics.timer_utils]
```
2 changes: 2 additions & 0 deletions makie_post_processing/makie_post_processing/Project.toml
@@ -9,7 +9,9 @@ Combinatorics = "861a8166-3701-5b0c-9a16-15d98fcdc6aa"
LaTeXStrings = "b964fa9f-0449-5b57-a5c2-d3ea65f4040f"
LsqFit = "2fda8390-95c7-5789-9bda-21331edee243"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
Makie = "ee78f7c6-11fb-53f2-987a-cfe4a2b5a57a"
NaNMath = "77ba4419-2d1f-58cd-9bb1-8ffee604a2e3"
OrderedCollections = "bac558e1-5e72-5ebc-8fee-abe8a469f55d"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
TOML = "fa267f1f-6049-4f14-aa54-33bafae1ed76"
moment_kinetics = "b5ff72cc-06fc-4161-ad14-dba1c22ed34e"