-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use LazyBroadcast in set_velocity_quantities #3608
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks mostly correct (although there may be a bug somewhere as some tests failed), but I find it a bit hard to understand how velocity is calculated as it is separated into multiple lines now. Maybe @dennisYatunin can take a look if this is ok?
Agree with the above. Separating the velocity computation into a couple of separate steps may lead to some errors. Would it be possible to keep the behavior of setting them all together while still switching to lazy broadcasts? Also a stupid question. Why do we need
instead of
|
I suppose we could instead pass the state variable(s), and remove the cached variable and delay inlining of
That's not a stupid question-- it's because we aren't broadcasting over its arguments ( |
I think what @trontrytel meant here was |
src/cache/precomputed_quantities.jl
Outdated
bc_ᶠu³ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃) | ||
bc_ᶜu = compute_ᶜu(ᶜuₕ, ᶠu₃) | ||
bc_ᶜK = compute_kinetic(ᶜuₕ, ᶠu₃) | ||
@. ᶠu³ = bc_ᶠu³ | ||
@. ᶜu = bc_ᶜu | ||
@. ᶜK = bc_ᶜK | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bc_ᶠu³ = compute_ᶠu³(ᶜuₕ, ᶜρ, ᶠu₃) | |
bc_ᶜu = compute_ᶜu(ᶜuₕ, ᶠu₃) | |
bc_ᶜK = compute_kinetic(ᶜuₕ, ᶠu₃) | |
@. ᶠu³ = bc_ᶠu³ | |
@. ᶜu = bc_ᶜu | |
@. ᶜK = bc_ᶜK | |
set_velocity_quantities!(ᶜu, ᶠu³, ᶜK, Y.f.u₃, Y.c.uₕ, ᶠuₕ³) |
I am also strongly opposed to this design pattern of code duplication. We should not allow the need for performance improvements to worsen the readability and modularity of our physics code. If your current strategy for lazy evaluation requires code duplication, it must be redesigned.
What I'm seeing here is equivalent to a single function call to something like set_velocity_quantities!
, which internally uses lazy broadcasts. If you are planning to further hoist the lazy expressions up through our chain of computations, you need to do this in a way that avoids manual inlining and duplication of code. I was exploring one way to achieve this in FusibleBroadcasts.jl, but there are definitely other strategies you can try out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm making a judgement call here, it's 3 lines of code, I don't think this is a big deal. I don't plan on inlining huge functions, at the moment, unless we can avoid massive duplication. Some of these PRs have been adding lines of code, but others have deleted code.
One issue I have with the current state of this function is that you need to look into it to know what is being mutated, which is not a good design pattern either. In addition, on the user side, you cannot perform part of this function, which is precisely the restart issue. We want to increase granularity, and all this function is doing is populating the cache. My hope is that all of these cache pieces will eventually disappear into other functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to do the same thing twice: on the grid mean and in edmf sub-domains. Keeping it inside one function call guarantees we do the same thing. Splitting it into separate calls can potentially lead to errors where we will change something in the grid mean computation but not in edmf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To facilitate physics development, locality of similar operations and minimization of code duplication should be one of our top priorities right now. I agree that 3 lines (which are currently 6 lines but it sounds like you'll change it to 3) are not a dealbreaker if they're only used in 3 places (grid-mean, environment, and updrafts), but I'm very concerned about the repercussions of using this design pattern more broadly. If ClimaAtmos becomes full of repeated calls to short lazy functions that each define one variable, it will be more challenging to read and debug. Zhaoyi, Anna, and I really don't want to see our code move in that direction, especially not while we are actively developing and debugging the implicit EDMF model. If we're going to experiment with manual lazification, we need to do it in a way that still allows us to set multiple fields within a single function. As a simple example, it should be possible to design our lazy operations so that set_velocity_quantities
can remain a single function. If it isn't, we need a different strategy.
In general, I think we should avoid performance improvements that make ClimaAtmos code harder to read or debug. The appropriate places for such performance improvements are packages like ClimaCore, ClimaTimeSteppers, LazyBroadcasts, FusibleBroadcasts, and UnrolledUtilities. ClimaAtmos should be primarily focused on clarity and simplicity of physics, especially for the duration of our implicit EDMF sprint this month.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine for temporarily leaving the function set_velocity_quantities!
, because we can still introduce the other functions and make a smaller, perhaps less controversial step forward.
Reducing the size of the cache, which is part of our initiative to make restarts possible, will require code duplication, as these variables that we currently compute once will need to be computed on-the-fly.
We can put this on pause if priorities have since changed. @trontrytel / @dennisYatunin, are we no longer prioritizing simulation restarts? If so, I can drop it from my todo list.
Please let me know, there's a lot of work to be done, and we don't have a lot of time..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Arguably, we could make all of these intermediate quantities broadcasted objects, but that could really hurt compilation time, I've been trying to keep their scope local, so that everyone doesn't suffer even longer compilation times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to discuss with more people. Maybe we can talk about it in the software meeting. This particular PR is short and may be ok, but I'm also concerned that this general pattern may affect the readability of the code. I would also like to understand if this is the only way for restarts to work (which I would think is not true for ClmaAtmos as we can restart now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine by me, but that's not for another week (from Monday). Should we schedule to meet before then?
I would also like to understand if this is the only way for restarts to work (which I would think is not true for ClmaAtmos as we can restart now).
IIRC, the issue is with restarts in the coupler.
I guess the other question I have is: is this a blocker to the additional issues discussed in #3594?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s try to meet early next week. I will send a slack message on Monday.
Ah, sure, I think that will work. I kind of like the pointer assignment separate from the materialization, but it's not a big deal for now. And I'm happy to update to using |
96f2929
to
a63dcd2
Compare
I've updated to use the |
Also, I've opened #3613. I'll rebase this PR after that merges. |
a63dcd2
to
deea162
Compare
deea162
to
6413832
Compare
6413832
to
4c3ed68
Compare
This PR is a step towards #3594.