-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify purpose of duplicate precipitation inputs #509
Comments
The example LSTM models were trained with NLDAS hourly forcings: "precipitation hourly total (kg/m^2)", more information here: https://ldas.gsfc.nasa.gov/nldas/v2/forcing. However, there may have been some trickery to get one of those two variables to play nice. Looks like @madMatchstick and Scott have made some adjustments since (https://github.com/NOAA-OWP/lstm/blob/d8100a9d4f651d65e81e2e9902f9a3728e9d1049/lstm/bmi_lstm.py#L102). It would be good to re-train a model with the AORC forcing data, so it is consistent. |
To clear the air on this once more, I'll explain the derivation of
the The exact transformation was done via these two functions. rho approximate density of water at temperatureThis function does not consider pressure. def rho(temp):
"""
Calculate water density at temperature
"""
return 999.99399 + 0.04216485*temp - 0.007097451*(temp**2) + 0.00003509571*(temp**3) - 9.9037785E-8*(temp**4) Cumulative to rate transformationThis function computes the RATE at which precipitation occurs, effectively converting def aorc_as_rate(dataFrame):
"""
Convert kg/m^2 -> m/s
"""
if isinstance(dataFrame.index, pd.MultiIndex):
interval = pd.Series(dataFrame.index.get_level_values(0))
else:
interval = pd.Series(dataFrame.index)
interval = ( interval.shift(-1) - interval ) / np.timedelta64(1, 's')
interval.index = dataFrame.index
precip_rate = ( dataFrame['APCP_surface'].shift(-1) / dataFrame['TMP_2maboveground'].apply(rho) ) / interval
precip_rate.name = 'precip_rate'
return precip_rate It is left to the user to decide which of these inputs is appropriate for a given model formulation, or whether or not the assumption that |
While I do appreciate the option to let a user decide which is appropriate for their model formulation, I still think this sort of conversion would be better suited for a BMI-enabled unit conversion, or left to the user to do from their own BMI wrapper. Including only the "official" published (https://hydrology.nws.noaa.gov/aorc-historic/Documents/AORC-Version1.1-SourcesMethodsandVerifications.pdf; Section 5.1: "The gridded AORC precipitation dataset contains one-hour accumulated surface- precipitation (APCP) ending at the top of each hour, in liquid water-equivalent units (kg/m^2 to the nearest .01 kg/m^2)") values could help reduce confusion in these instances. |
@jmframe , Something important that may not be obvious about @hellkite500 's reply above is that none of that code is actually in the model engine... that's just some code that was used to compute one column from the other in some of the data we've used internally and which is in the repo as example data. Also, in the context of the original question... the presence of both in AorcForcing.hpp is mainly just a way to attach units to a few known CSV headers (For which there is no formal convention for representing units)... the code itself lets you bring in any quantity with any header and you can wire a formulation to it either by default name matching or using |
As for the value mapping between the two being unclear, if the above explanation was a little more TL;DR than you were looking for, I'll add this briefer but less completely accurate explanation: precip_rate[t-1] = ( ( APCP_surface[t] / 3600 ) / 1000 ) * density_factor_very_near_one where:
That last bit may depend on the interpretation of precip_rate: is it the rate happening in that timestep hour, or is is the rate that led to the accumulation of precip we measured at this instantaneous hour marker? In either case, I'm guessing that the bump from one timestep to the next was what was causing @jameshalgren to be completely lost as to the relationship between the two columns. |
@jmframe @mattw-nws
Using the equation above to compute the density factor, the temperature must be in degrees Celsius (i.e., the Kelvin value minus 273.15)
The same question about average temperature and instantaneous temperature applies -- which value is applicable to perform the density calculation? (though the answer is nearly inconsequential as the density difference from one hour to the next, even in cases of extraordinary temperature gradients, is less than a percent -- 0 degrees C to 40 degrees C yields 0.8% density change...) Here's my scratch spreadsheet, built from Thanks for the input. The short story is, now we know how to get all the forcing inputs needed for a CFE run from the existing NWM forcing inputs. |
Self closing. We might make this issue:
|
Yep, this is where #389 came from! 😆 |
@JoshCu and should probably update this |
The CSV input files include both
APCP_surface
andprecip_rate
. It would seem these are redundant. The former would be expected to be an accumulated precipitation amount and the latter a rate, and, if the are derived from the same source, should be easily transformed one to the other. But... that doesn't seem to be the case -- there is no obvious linear transformation relating the two values. (We recognize that kg/m2/s is equivalent to mm/s... it isn't a strict unit conversion issue we're seeing here, at least it doesn't appear to be...)What is the meaning of each variable and where are the values used?
Doing some digging, it appears that
precip_rate
is used in the LSTM model andAPCP_surface
may be used in the CFE-based simulations.Possibly useful line references:
ngen/include/forcing/AorcForcing.hpp
Line 19 in f914038
ngen/src/models/lstm/LSTM.cpp
Line 170 in f914038
ngen/models/tshirt/include/tshirt_c.h
Line 53 in f914038
The text was updated successfully, but these errors were encountered: