Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use LazyBroadcast in set_velocity_quantities #3608

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

charleskawczynski
Copy link
Member

This PR is a step towards #3594.

Copy link
Member

@szy21 szy21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks mostly correct (although there may be a bug somewhere as some tests failed), but I find it a bit hard to understand how velocity is calculated as it is separated into multiple lines now. Maybe @dennisYatunin can take a look if this is ok?

@szy21 szy21 requested a review from trontrytel February 7, 2025 18:59
@trontrytel
Copy link
Member

Agree with the above. Separating the velocity computation into a couple of separate steps may lead to some errors. Would it be possible to keep the behavior of setting them all together while still switching to lazy broadcasts?

Also a stupid question. Why do we need

        bc_ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)
        @. ᶠu³ʲ = bc_ᶠu³ʲ

instead of

@. ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)

@charleskawczynski
Copy link
Member Author

Agree with the above. Separating the velocity computation into a couple of separate steps may lead to some errors. Would it be possible to keep the behavior of setting them all together while still switching to lazy broadcasts?

I suppose we could instead pass the state variable(s), and remove the cached variable and delay inlining of set_velocity_quantities!.

Also a stupid question. Why do we need

        bc_ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)
        @. ᶠu³ʲ = bc_ᶠu³ʲ

instead of

@. ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)

That's not a stupid question-- it's because we aren't broadcasting over its arguments ((ᶜuₕ, ᶜρ, ᶠu₃ʲ))-- internally its performing a broadcast operation and returning a broadcasted object. So, it needs to be hoisted.

@dennisYatunin
Copy link
Member

Why do we need

        bc_ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)
        @. ᶠu³ʲ = bc_ᶠu³ʲ

instead of

@. ᶠu³ʲ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ)

I think what @trontrytel meant here was ᶠu³ʲ .= compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ). I am definitely in favor of keeping things on one line like this to facilitate readability. Also, I would like to request that we use lazy.( instead of @lazy(, since macro calls are not as beginner-friendly as function calls.

Comment on lines 395 to 415
bc_ᶠu³ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃)
bc_ᶜu = compute_ᶜu(ᶜuₕ, ᶠu₃)
bc_ᶜK = compute_kinetic(ᶜuₕ, ᶠu₃)
@. ᶠu³ = bc_ᶠu³
@. ᶜu = bc_ᶜu
@. ᶜK = bc_ᶜK

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bc_ᶠu³ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃)
bc_ᶜu = compute_ᶜu(ᶜuₕ, ᶠu₃)
bc_ᶜK = compute_kinetic(ᶜuₕ, ᶠu₃)
@. ᶠu³ = bc_ᶠu³
@. ᶜu = bc_ᶜu
@. ᶜK = bc_ᶜK
set_velocity_quantities!(ᶜu, ᶠu³, ᶜK, Y.f.u₃, Y.c.uₕ, ᶠuₕ³)

I am also strongly opposed to this design pattern of code duplication. We should not allow the need for performance improvements to worsen the readability and modularity of our physics code. If your current strategy for lazy evaluation requires code duplication, it must be redesigned.

What I'm seeing here is equivalent to a single function call to something like set_velocity_quantities!, which internally uses lazy broadcasts. If you are planning to further hoist the lazy expressions up through our chain of computations, you need to do this in a way that avoids manual inlining and duplication of code. I was exploring one way to achieve this in FusibleBroadcasts.jl, but there are definitely other strategies you can try out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm making a judgement call here, it's 3 lines of code, I don't think this is a big deal. I don't plan on inlining huge functions, at the moment, unless we can avoid massive duplication. Some of these PRs have been adding lines of code, but others have deleted code.

One issue I have with the current state of this function is that you need to look into it to know what is being mutated, which is not a good design pattern either. In addition, on the user side, you cannot perform part of this function, which is precisely the restart issue. We want to increase granularity, and all this function is doing is populating the cache. My hope is that all of these cache pieces will eventually disappear into other functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to do the same thing twice: on the grid mean and in edmf sub-domains. Keeping it inside one function call guarantees we do the same thing. Splitting it into separate calls can potentially lead to errors where we will change something in the grid mean computation but not in edmf

Copy link
Member

@dennisYatunin dennisYatunin Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To facilitate physics development, locality of similar operations and minimization of code duplication should be one of our top priorities right now. I agree that 3 lines (which are currently 6 lines but it sounds like you'll change it to 3) are not a dealbreaker if they're only used in 3 places (grid-mean, environment, and updrafts), but I'm very concerned about the repercussions of using this design pattern more broadly. If ClimaAtmos becomes full of repeated calls to short lazy functions that each define one variable, it will be more challenging to read and debug. Zhaoyi, Anna, and I really don't want to see our code move in that direction, especially not while we are actively developing and debugging the implicit EDMF model. If we're going to experiment with manual lazification, we need to do it in a way that still allows us to set multiple fields within a single function. As a simple example, it should be possible to design our lazy operations so that set_velocity_quantities can remain a single function. If it isn't, we need a different strategy.

In general, I think we should avoid performance improvements that make ClimaAtmos code harder to read or debug. The appropriate places for such performance improvements are packages like ClimaCore, ClimaTimeSteppers, LazyBroadcasts, FusibleBroadcasts, and UnrolledUtilities. ClimaAtmos should be primarily focused on clarity and simplicity of physics, especially for the duration of our implicit EDMF sprint this month.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine for temporarily leaving the function set_velocity_quantities!, because we can still introduce the other functions and make a smaller, perhaps less controversial step forward.

Reducing the size of the cache, which is part of our initiative to make restarts possible, will require code duplication, as these variables that we currently compute once will need to be computed on-the-fly.

We can put this on pause if priorities have since changed. @trontrytel / @dennisYatunin, are we no longer prioritizing simulation restarts? If so, I can drop it from my todo list.

Please let me know, there's a lot of work to be done, and we don't have a lot of time..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arguably, we could make all of these intermediate quantities broadcasted objects, but that could really hurt compilation time, I've been trying to keep their scope local, so that everyone doesn't suffer even longer compilation times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to discuss with more people. Maybe we can talk about it in the software meeting. This particular PR is short and may be ok, but I'm also concerned that this general pattern may affect the readability of the code. I would also like to understand if this is the only way for restarts to work (which I would think is not true for ClmaAtmos as we can restart now).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine by me, but that's not for another week (from Monday). Should we schedule to meet before then?

I would also like to understand if this is the only way for restarts to work (which I would think is not true for ClmaAtmos as we can restart now).

IIRC, the issue is with restarts in the coupler.

I guess the other question I have is: is this a blocker to the additional issues discussed in #3594?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s try to meet early next week. I will send a slack message on Monday.

@charleskawczynski
Copy link
Member Author

I think what @trontrytel meant here was ᶠu³ʲ .= compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃ʲ). I am definitely in favor of keeping things on one line like this to facilitate readability. Also, I would like to request that we use lazy.( instead of @lazy(, since macro calls are not as beginner-friendly as function calls.

Ah, sure, I think that will work. I kind of like the pointer assignment separate from the materialization, but it's not a big deal for now.

And I'm happy to update to using lazy, I think you're right regarding simplicity. I'll open a sweeping change for that soon.

@charleskawczynski
Copy link
Member Author

I've updated to use the x .= f() syntax, cc @dennisYatunin.

@charleskawczynski
Copy link
Member Author

Also, I've opened #3613. I'll rebase this PR after that merges.

@charleskawczynski charleskawczynski force-pushed the ck/set_velocity_quantities branch from a63dcd2 to deea162 Compare February 11, 2025 20:31
@charleskawczynski charleskawczynski force-pushed the ck/set_velocity_quantities branch from deea162 to 6413832 Compare February 12, 2025 16:25
@charleskawczynski charleskawczynski force-pushed the ck/set_velocity_quantities branch from 6413832 to 4c3ed68 Compare February 12, 2025 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants