Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT compilation updates and compatibility with f90wrap v0.2.14 #315

Merged
merged 11 commits into from
May 10, 2024

Conversation

reuterbal
Copy link
Collaborator

The latest release of f90wrap includes module names in the name of generated interface routines (introduced in jameskermode/f90wrap#215). This lets them grow quickly beyond the Fortran length limit of 63 characters for names, and gfortran (not even 13.2) does not allow to lift that limitation.

Therefore, I have shorted the module and routine names in the test base where the character limit was exceeded.
In the process, I also introduced the use of tempdir fixtures more widely to ensure clean-up of test files is handled gracefully and the littering of the source directory is reduced.

Copy link

github-actions bot commented May 9, 2024

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/315/index.html

@reuterbal reuterbal requested a review from mlange05 May 9, 2024 23:24
Copy link
Collaborator

@mlange05 mlange05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look great, and many thanks for grinding this out. Only comment I'd have is that the tempdir fixture (and possibly similar fixtures used in JIT-based tests) could eventually be moved to a common location (new conftest.py?) to avoid replication, but that is for another PR, I'd say. GTG from me. :shipit:

@reuterbal
Copy link
Collaborator Author

This turned into quite a rabbit hole. Something in the original set of changes seems to have slightly skewed timings in the test execution, presumably due to the increased use of test-local temporary directories. The symptom was then random segfaults when garbage collection was triggered.

The subsequent investigation unearthed a few glaring holes in our JIT infrastructure for tests:

  • The tests of the parametrise transformation accidentally rebuilt the same files at least twice per test, and both loaded as C extensions into the running Python process. Some unfortunate interplay between GC, address offsets and name clashes resulted in segfaults. This has been mitigated by ensuring each file is only compiled once per test.
  • In the process of debugging, I applied the tmp_path fixture more widely to curtail the creation of in-source test files further and avoid aliasing surprises.
  • This has helped to discover a new issue - Surprise! The Obj nodes that are used to build the dependency graph for library builds are cached. The cache index is, however, simply a "name", which defaults to the filepath stem. With our various parametrised tests, that generate variants of files with the same name in different temporary directories, we now hit aliasing issues because the full file path is not taken into account. This needs to be cleaned up properly, potentially using Scheduler infrastructure to derive these dependency graphs. For now, I introduced a cache cleaning mechanism, which is now triggered as part of the builder fixtures after completing a test.
  • To stress-test this further, I have now replaced the tempdir fixture in all tests by tmp_path

I'm afraid, the end result is now a fairly sizeable diff. Sorry!

@reuterbal reuterbal requested a review from mlange05 May 10, 2024 11:29
Copy link

codecov bot commented May 10, 2024

Codecov Report

Attention: Patch coverage is 98.65229% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 95.11%. Comparing base (5db17e5) to head (b4decea).

Files Patch % Lines
loki/expression/tests/test_expression.py 33.33% 2 Missing ⚠️
loki/transformations/transpile/tests/test_sdfg.py 91.30% 2 Missing ⚠️
loki/transformations/tests/test_parametrise.py 96.15% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #315      +/-   ##
==========================================
+ Coverage   95.09%   95.11%   +0.01%     
==========================================
  Files         165      165              
  Lines       35301    35065     -236     
==========================================
- Hits        33571    33351     -220     
+ Misses       1730     1714      -16     
Flag Coverage Δ
lint_rules 96.39% <ø> (ø)
loki 95.08% <98.65%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@mlange05 mlange05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed quite a handful of changes in the end... 😭 Many thanks for grinding this through, and many apologies for the JIT issues (how did that ever work? 🤔 In any case, very much appreciated and overall reduction in boilerplate is still very nice!

GTG from me! :shipit:

@reuterbal reuterbal changed the title Compatibility with f90wrap v0.2.14 JIT compilation updates and compatibility with f90wrap v0.2.14 May 10, 2024
@reuterbal reuterbal merged commit 5300280 into main May 10, 2024
12 checks passed
@reuterbal reuterbal deleted the nabr-fix-f90wrap branch May 10, 2024 12:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants