Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify purpose of duplicate precipitation inputs #509

Closed
jameshalgren opened this issue Apr 10, 2023 · 9 comments
Closed

Clarify purpose of duplicate precipitation inputs #509

jameshalgren opened this issue Apr 10, 2023 · 9 comments

Comments

@jameshalgren
Copy link

jameshalgren commented Apr 10, 2023

The CSV input files include both APCP_surface and precip_rate. It would seem these are redundant. The former would be expected to be an accumulated precipitation amount and the latter a rate, and, if the are derived from the same source, should be easily transformed one to the other. But... that doesn't seem to be the case -- there is no obvious linear transformation relating the two values. (We recognize that kg/m2/s is equivalent to mm/s... it isn't a strict unit conversion issue we're seeing here, at least it doesn't appear to be...)

What is the meaning of each variable and where are the values used?

Doing some digging, it appears that precip_rate is used in the LSTM model and APCP_surface may be used in the CFE-based simulations.

Possibly useful line references:

#define AORC_FIELD_NAME_PRECIP_RATE "precip_rate"

forcing[0][0] = lstm_model::normalize("Precip_rate", precip_meters_per_second);

float precip_kg_per_m2; // Surface precipitation "kg/m^2" | APCP_surface

@jmframe
Copy link
Contributor

jmframe commented Apr 11, 2023

The example LSTM models were trained with NLDAS hourly forcings: "precipitation hourly total (kg/m^2)", more information here: https://ldas.gsfc.nasa.gov/nldas/v2/forcing. However, there may have been some trickery to get one of those two variables to play nice. Looks like @madMatchstick and Scott have made some adjustments since (https://github.com/NOAA-OWP/lstm/blob/d8100a9d4f651d65e81e2e9902f9a3728e9d1049/lstm/bmi_lstm.py#L102). It would be good to re-train a model with the AORC forcing data, so it is consistent.

@hellkite500
Copy link
Member

hellkite500 commented Apr 11, 2023

To clear the air on this once more, I'll explain the derivation of precip_rate from APCP_surface.

APCP_surface is the AORC variable in units of kg/m2. This is often ASSUMED to be ~= mm/hr, but that is an assumption that depends ultimately on the density of water and the inferred time unit of the input.

the precip_rate variable was used in some of the early model testing as an actual rate, in units of m/s valid until the next entry (hour, in this case) in the time series.

The exact transformation was done via these two functions.

rho approximate density of water at temperature

This function does not consider pressure.

def rho(temp): 
    """
        Calculate water density at temperature
    """
    return 999.99399 + 0.04216485*temp - 0.007097451*(temp**2) + 0.00003509571*(temp**3) - 9.9037785E-8*(temp**4) 

Cumulative to rate transformation

This function computes the RATE at which precipitation occurs, effectively converting kg/m^2 to m/s by using the AORC interval of 1 hour between inputs (the shift) in seconds, and rho.

def aorc_as_rate(dataFrame):
    """
        Convert kg/m^2 -> m/s
    """
    if isinstance(dataFrame.index, pd.MultiIndex):
        interval = pd.Series(dataFrame.index.get_level_values(0))
    else:
        interval = pd.Series(dataFrame.index)
    interval = ( interval.shift(-1) - interval ) / np.timedelta64(1, 's')
    interval.index = dataFrame.index
    precip_rate = ( dataFrame['APCP_surface'].shift(-1) / dataFrame['TMP_2maboveground'].apply(rho) ) / interval
    precip_rate.name = 'precip_rate'
    return precip_rate

It is left to the user to decide which of these inputs is appropriate for a given model formulation, or whether or not the assumption that kg/m^2 == mm/hr is a reasonable one for their application.

@jmframe
Copy link
Contributor

jmframe commented Apr 11, 2023

While I do appreciate the option to let a user decide which is appropriate for their model formulation, I still think this sort of conversion would be better suited for a BMI-enabled unit conversion, or left to the user to do from their own BMI wrapper. Including only the "official" published (https://hydrology.nws.noaa.gov/aorc-historic/Documents/AORC-Version1.1-SourcesMethodsandVerifications.pdf; Section 5.1: "The gridded AORC precipitation dataset contains one-hour accumulated surface- precipitation (APCP) ending at the top of each hour, in liquid water-equivalent units (kg/m^2 to the nearest .01 kg/m^2)") values could help reduce confusion in these instances.

@mattw-nws
Copy link
Contributor

While I do appreciate the option to let a user decide which is appropriate for their model formulation, I still think this sort of conversion would be better suited for a BMI-enabled unit conversion, or left to the user to do from their own BMI wrapper.

@jmframe , Something important that may not be obvious about @hellkite500 's reply above is that none of that code is actually in the model engine... that's just some code that was used to compute one column from the other in some of the data we've used internally and which is in the repo as example data.

Also, in the context of the original question... the presence of both in AorcForcing.hpp is mainly just a way to attach units to a few known CSV headers (For which there is no formal convention for representing units)... the code itself lets you bring in any quantity with any header and you can wire a formulation to it either by default name matching or using variables_names_map.

@mattw-nws
Copy link
Contributor

mattw-nws commented Apr 11, 2023

As for the value mapping between the two being unclear, if the above explanation was a little more TL;DR than you were looking for, I'll add this briefer but less completely accurate explanation:

precip_rate[t-1] = ( ( APCP_surface[t] / 3600 ) / 1000 ) * density_factor_very_near_one

where:

  • 3600 = seconds in one hour
  • 1000 = millimeters in a meter (precip_rate column units is m/s, not mm/s)
  • t-1 means the previous timestep because the accumulation in APCP_surface is measured at the end of the hour, so the precip_rate is the rate of precip that was happening in the previous hour.

That last bit may depend on the interpretation of precip_rate: is it the rate happening in that timestep hour, or is is the rate that led to the accumulation of precip we measured at this instantaneous hour marker?

In either case, I'm guessing that the bump from one timestep to the next was what was causing @jameshalgren to be completely lost as to the relationship between the two columns.

@jameshalgren
Copy link
Author

jameshalgren commented Apr 11, 2023

@jmframe @mattw-nws
Good points regarding the duplication -- my original question was about the input files. We can log a todo out of this issue to either clean up the input files or clarify the documentation or both.

density_factor_very_near_one

Using the equation above to compute the density factor, the temperature must be in degrees Celsius (i.e., the Kelvin value minus 273.15)

is it the rate happening in that timestep hour, or is is the rate that led to the accumulation of precip we measured at this instantaneous hour marker?

The same question about average temperature and instantaneous temperature applies -- which value is applicable to perform the density calculation? (though the answer is nearly inconsequential as the density difference from one hour to the next, even in cases of extraordinary temperature gradients, is less than a percent -- 0 degrees C to 40 degrees C yields 0.8% density change...)

Here's my scratch spreadsheet, built from cat-52_2015-12-01 00_00_00_2015-12-30 23_00_00.csv to see if I understood (the reference going off image to the top is a cell with the 273.15 constant value.):
image

Thanks for the input. The short story is, now we know how to get all the forcing inputs needed for a CFE run from the existing NWM forcing inputs.

@jameshalgren
Copy link
Author

jameshalgren commented Apr 11, 2023

Self closing. We might make this issue:

  • Create issue regarding simplification of in-repo input files to retain only one value of precipitation, with in-line conversion capability ported to a BMI or other function.

@mattw-nws
Copy link
Contributor

Using the equation above to compute the density factor, the temperature must be in degrees Celsius...

Yep, this is where #389 came from! 😆

@jameshalgren
Copy link
Author

jameshalgren commented Dec 22, 2023

@JoshCu
Also probably related to:
NOAA-OWP/cfe#29

and should probably update this
#302

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants