Benchmarks for measuring assembly time #1216

Open
keileg opened this issue Aug 21, 2024 · 12 comments

keileg commented Aug 21, 2024

Define and implement a few benchmark models that can be used to measure (improvements in) model assembly time.

Suggested criteria:

  1. There should only be a few (say, 3-4) benchmarks.
  2. The total setup time should be limited to facilitate frequent testing.
  3. There should be a span in geometric complexity (2d and 3d, up to dozens of fractures).
  4. Cover 2-3 sets of physics (e.g., fluid flow, HM, possibly THM).
  5. Setting up the models should not be a major effort. Reuse of setups from tests or similar should be feasible.

Other considerations to be made:

  1. Should we place the code in a separate repository? My instinct says yes.
  2. Will we base timing on logging, line_profiler or an external tool like https://github.com/plasma-umass/scalene?

EDIT
While there are nuances in how best to measure various aspects related to multiphysics, it seems clear we want a benchmark mainly dedicated to geometric complexity, keeping the physics simple (that is, mass balance only). The specification of this first step is roughly as follows (some critical thinking and interpretation should be applied):

  1. Test cases: It is natural to use setups from the 2d and 3d benchmarks for flow, though it is not a goal to cover all the cases. Some of the setups are already available (see the tutorial flow_benchmarks), and more may be available through fracture_sets.py. Note that there are non-trivial aspects of the geometry and boundary conditions for some of the tutorials; EK can give more information (and point to gmsh files that partly resolve these issues).
  2. For each geometry, there should be some flexibility in terms of mesh resolution, and possibly also other parameters, but this should not be overdone (a sketch of such a parametrized run is given after this list).
  3. (Partly) decoupled from the specific setups are the questions of where to put the code and how to structure it to allow for reuse and to fit with there being other benchmarks (I don't know exactly what this means), etc. The expectation is that this should be kept in mind but not optimized prematurely.
  4. It is also of interest to consider solutions for tracking performance, including monitoring over time. Again, this is something not to be overengineered at this early stage.
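A minimal sketch of what such a parametrized timing run could look like. This assumes PorePy's model framework; SinglePhaseFlow exists, but the meshing_arguments key, prepare_simulation, and equation_system.assemble calls should be checked against the actual API, and the fracture geometries would be supplied by overriding the model's geometry mixins:

```python
import time

import porepy as pp


def make_flow_model(cell_size: float):
    """Set up (but do not run) a single-phase flow model.

    The meshing_arguments key is an assumption; a real benchmark would also
    override the geometry to insert one of the benchmark fracture networks.
    """
    params = {"meshing_arguments": {"cell_size": cell_size}}
    return pp.fluid_mass_balance.SinglePhaseFlow(params)


def time_assembly(model, num_repeats: int = 5) -> float:
    """Best-of-n wall-clock time for one assembly of the Jacobian and residual."""
    best = float("inf")
    for _ in range(num_repeats):
        t0 = time.perf_counter()
        model.equation_system.assemble()  # Assumed assembly entry point.
        best = min(best, time.perf_counter() - t0)
    return best


for cell_size in [0.5, 0.25, 0.125]:
    model = make_flow_model(cell_size)
    model.prepare_simulation()  # Builds the md grid, variables, and equations.
    print(f"cell_size={cell_size}: {time_assembly(model):.3f} s")
```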
keileg commented Aug 22, 2024

A perhaps better approach to the design is to identify the parts of the framework that should be covered by the benchmarks, and use this to define the cases. My suggestion would be:

  1. Operations connected to the md grid structure: Subdomain and mortar projections, matrices from discretizations of elemental differential operators (e.g. divergence and Darcy's law). We may want to cover scaling with the number of subdomains, grid size, and possibly spatial dimension.
  2. Constitutive laws related to fracture deformation, in particular the often deeply nested structure resulting from the more complex laws.
  3. Constitutive laws related to compositional multiphase transport.

These can be implemented as follows (a declarative encoding of the resulting benchmark matrix is sketched after this list):

  1. A single phase flow model, with the following parametrization:
    • Number of subdomains: The third and fourth fracture network from the 2d benchmark, the second (structured) case from the 3d benchmark.
    • Grid resolution: A few fixed grid parameters for each grid.
    • Discretization: Possibly vary between Tpfa and Mpfa, but mainly use Tpfa.
  2. A (T)HM model with a rich set of constitutive laws enabled. This includes shear dilation, diff-tpfa etc. Variations:
    • Possibly 2d or 3d. Only a few fractures in each
    • Possibly some variations in grid resolution.
  3. A model focused on constitutive laws related to multiphase compositional transport, with a rich set of constitutive laws, though it is not clear to me right now what this entails.
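To make these variations concrete, the benchmark matrix itself could be encoded declaratively and expanded programmatically. A minimal sketch (all names and values are illustrative, not an agreed-upon format):

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkCase:
    """One benchmark configuration."""

    physics: str  # E.g. "single_phase_flow" or "thm".
    geometry: str  # Identifier of a fracture network.
    cell_size: float  # Target mesh size.
    discretization: str  # "tpfa" or "mpfa".


# Item 1 above: single-phase flow over three geometries and a few
# resolutions, mainly with Tpfa.
flow_cases = [
    BenchmarkCase("single_phase_flow", geometry, cell_size, "tpfa")
    for geometry, cell_size in itertools.product(
        ["2d_benchmark_3", "2d_benchmark_4", "3d_benchmark_2_structured"],
        [0.5, 0.25, 0.125],
    )
]
```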

keileg commented Aug 22, 2024

Additional thoughts regarding setup etc. (partly notes to self):

  1. We should put the run scripts in a separate repository.
  2. Performance improvements over time can be tracked by storing key data in local files, with relevant plotting or analysis functionality in the benchmark repo (a minimal sketch is given below). This of course assumes that the hardware etc. stays fixed, but that is up to the person doing the benchmarking.
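A minimal sketch of such local tracking, assuming timings are appended to a per-machine CSV file (file name and fields are illustrative):

```python
import csv
import datetime
import pathlib

RESULTS_FILE = pathlib.Path("benchmark_results.csv")  # Illustrative location.


def record_result(case_name: str, assembly_time: float, commit: str) -> None:
    """Append one timing, creating the file with a header row if needed."""
    is_new = not RESULTS_FILE.exists()
    with RESULTS_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "commit", "case", "assembly_time_s"])
        writer.writerow(
            [
                datetime.date.today().isoformat(),
                commit,
                case_name,
                f"{assembly_time:.4f}",
            ]
        )
```

Plotting or simple regression detection can then read the file back with pandas or plain csv.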

IvarStefansson commented

What are the advantages of having a separate repo?

keileg commented Aug 26, 2024

What are the advantages of having a separate repo?

Cleanliness. But I see we can achieve the same with a somewhat carefully structured application setup.

keileg commented Sep 3, 2024

Additional thoughts after discussion in person:

  • As a complement to a full physics benchmark, a more fine-grained test of individual constitutive laws could give useful information. For this, the method for identifying constitutive laws could be relevant. Along the same lines, a test of the contents of the model geometry could be useful, in particular for Reimplementation of Ad projection operators #1182.
  • It could also be relevant to parametrize the depth of the Ad trees to test for scalability along this dimension. This can only to some extent be covered by considering different physical models.

The next step is to set up a full example of a benchmark, likely along the lines of implementation step 1 above.

Yuriyzabegaev commented

@pschultzendorff and I could not make Scalene work as expected. Instead, we found a solution based on cProfile and SnakeViz: https://kirillstrelkov.medium.com/easy-python-profiling-a70cbf699295

The solution provides the following UI for CPU profiling:
[screenshot: SnakeViz CPU profiling view]
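For reference, the core of that workflow is the standard library's cProfile plus SnakeViz on the resulting file. A sketch, where main() stands in for a hypothetical benchmark entry point:

```python
import cProfile
import pstats

from run_benchmarks import main  # Hypothetical entry point.

# Write profiling data to a file that SnakeViz can open (snakeviz result.prof).
cProfile.run("main()", "result.prof")

# Optional text summary: the 20 most expensive calls by cumulative time.
pstats.Stats("result.prof").sort_stats("cumulative").print_stats(20)
```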

Yuriyzabegaev commented

An even better approach we found:
[screenshot: VizTracer timeline in vizviewer]

```
pip install viztracer
viztracer --min_duration 10ms --ignore_c_function --ignore_frozen --max_stack_depth 20 run_benchmarks.py
vizviewer --port 9002 result.json
```

pschultzendorff commented

To expand on Yury's approach: In vizviewer, the standard view displays all method calls in hierarchical order as they occur during runtime. To find the methods with the longest overall runtime, select a timeframe by clicking and dragging in the timeline. This reveals two new views below: a table listing methods sortable by summed, maximum, minimum, or average runtime (reported in a different unit, likely CPU cycles), and a graph showing the hierarchical method calls ordered by total runtime.

pschultzendorff commented

TODO

  • Yury creates a 2D and a 3D poromechanics benchmark
  • Peter writes a script that runs viztracer and vizviewer on a chosen benchmark and allows some simple user input (model, dimension, grid refinement); a sketch of such a script follows below.
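A sketch of what that script could look like (the benchmark file names and their command-line interface, e.g. the --cell-size flag, are assumptions):

```python
import argparse
import subprocess

# Hypothetical mapping from benchmark name to run script; adjust to the repo layout.
BENCHMARKS = {
    "poromech_2d": "benchmarks/poromech_2d.py",
    "poromech_3d": "benchmarks/poromech_3d.py",
}

parser = argparse.ArgumentParser(description="Profile a benchmark with viztracer.")
parser.add_argument("model", choices=BENCHMARKS)
parser.add_argument("--cell-size", type=float, default=0.25)
parser.add_argument("--port", type=int, default=9002)
args = parser.parse_args()

# Trace the benchmark run; the flags mirror the command suggested above.
# Passing --cell-size on assumes the benchmark script itself accepts it.
subprocess.run(
    [
        "viztracer",
        "--min_duration", "10ms",
        "--ignore_c_function",
        "--ignore_frozen",
        "--max_stack_depth", "20",
        BENCHMARKS[args.model],
        f"--cell-size={args.cell_size}",
    ],
    check=True,
)

# Open the interactive viewer on viztracer's default output file.
subprocess.run(["vizviewer", "--port", str(args.port), "result.json"], check=True)
```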

keileg commented Nov 1, 2024

Nice work, I'm looking forward to seeing where this is going and to taking it into use in the maintenance work planned for the coming weeks/months.

Looking at the original specification (under EDIT), it seems the first three points are well under way. These are also the points that are most useful for benchmarking during maintenance, so prioritizing them makes sense. When we have reached a satisfactory stage on those, my thinking right now (it may change) is to have a look at the fourth item (systematic benchmarking over time) and see if something simple and useful can be done there as well.

It seems clear, though, that the full issue must be addressed in stages, so let's try to keep in mind that at some point we should put this to rest for a while and get back to it after having gained some experience by using the functionality.

keileg commented Nov 7, 2024

An additional fracture network is available here.

pschultzendorff linked a pull request Nov 8, 2024 that will close this issue
pschultzendorff mentioned this issue Nov 8, 2024
pschultzendorff commented

An argument against performance tracking with GitHub Actions: we have no guarantee that GitHub consistently employs the same resources, nor can we find out which resources it employs. In fact, this blog post shows that the CPU time of simple benchmarks can vary by a factor of 3.

Together with Yury's remarks on the difficulty of coding such an action, I think it is best if we focus on a local cron job. This should be rather straightforward, and we just have to decide where to save the results and on which machine to run the job (a possible crontab entry is sketched below).
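For concreteness, a possible crontab entry (path and schedule are illustrative; the script would run the benchmarks and append the results to file as sketched earlier):

```
# Run the benchmark suite every Monday at 02:00 and log the output.
0 2 * * 1 cd /path/to/benchmark-repo && python run_benchmarks.py >> cron.log 2>&1
```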

keileg removed a link to a pull request Nov 8, 2024