Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing GPU synchronize #1402

Merged
merged 7 commits into from
Dec 11, 2024
Merged

Add missing GPU synchronize #1402

merged 7 commits into from
Dec 11, 2024

Conversation

marchdf
Copy link
Contributor

@marchdf marchdf commented Dec 11, 2024

Summary

Pull request type

Please check the type of change introduced:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Checklist

The following is included:

  • new unit-test(s)
  • new regression test(s)
  • documentation for new capability

This PR was tested by running:

  • the unit tests
    • on GPU
    • on CPU
  • the regression tests
    • on GPU
    • on CPU

Additional background

Issue Number:

@marchdf marchdf changed the title Add missing GPU syncs Add missing GPU synchronize Dec 11, 2024
@marchdf
Copy link
Contributor Author

marchdf commented Dec 11, 2024

The main ones I am struggling with now are the ones in overset_ops_routines.cpp. For example, populate_psi is being used like this:

for (int lev = 0; lev < nlevels; ++lev) {
        // Thickness used here is user parameter, whatever works best
        const auto dx = (geom[lev]).CellSizeArray();
        const amrex::Real i_th =
            m_relative_length_scale * std::cbrt(dx[0] * dx[1] * dx[2]);

        // Populate approximate signed distance function
        overset_ops::populate_psi(levelset(lev), vof(lev), i_th, m_asdf_tiny);

        gp(lev).setVal(0.0);
        amrex::MultiFab::Copy(gp(lev), gp_noghost(lev), 0, 0, 3, 0);
        gp(lev).FillBoundary(geom[lev].periodicity());

        // Get pseudo-velocity scale, proportional to smallest dx in iblank
        const amrex::Real pvscale_lev =
            overset_ops::calculate_pseudo_velocity_scale(
                iblank_cell(lev), dx, max_pvscale);
        pvscale = std::min(pvscale, pvscale_lev);
    }
    amrex::Gpu::synchronize();

It has a synchronize but there is also a FillBoundary call right after it...

@marchdf
Copy link
Contributor Author

marchdf commented Dec 11, 2024

Apparently the fill boundary call is safe like it is.

@marchdf marchdf requested a review from jrood-nrel December 11, 2024 20:29
@marchdf marchdf marked this pull request as ready for review December 11, 2024 20:36
@jrood-nrel jrood-nrel enabled auto-merge (squash) December 11, 2024 20:38
@jrood-nrel jrood-nrel merged commit dd8f5e8 into Exawind:main Dec 11, 2024
15 checks passed
@marchdf marchdf mentioned this pull request Dec 13, 2024
@marchdf marchdf deleted the missing-syncs branch December 13, 2024 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants