WW3 ICs are not read for HR-like experiments. #3109

Open
JessicaMeixner-NOAA opened this issue Nov 18, 2024 · 9 comments · May be fixed by #3112

@JessicaMeixner-NOAA
Contributor

What is wrong?

The gfs_stage_ic job copies wave ICs:

2024-11-17 13:45:38,610 - INFO - file_utils : Copied /lfs/h2/emc/couple/noscrub/jessica.meixner/WaveUglo15km/ICDIR/Opt3/gfs.20200912/18/model/wave/restart/20200913.000000.restart.ww3 to /lfs/h2/emc/couple/noscrub/jessica.meixner/WaveUglo15km/Opt3Sept/COMROOT/Opt3Sept/gfs.20200912/18//model/wave/restart

But then the forecast job does not find ICs:

+ exglobal_forecast.sh[124]: WW3_postdet
+ forecast_postdet.sh[327]: echo 'SUB WW3_postdet: Linking input data for WW3'
SUB WW3_postdet: Linking input data for WW3
+ forecast_postdet.sh[329]: local ww3_grid first_ww3_restart_out ww3_restart_file
+ forecast_postdet.sh[331]: [[ .false. == \.\t\r\u\e\. ]]
+ forecast_postdet.sh[357]: echo 'WW3 will start from rest!'
WW3 will start from rest!
+ forecast_postdet.sh[358]: first_ww3_restart_out=2020091300
+ forecast_postdet.sh[362]: local ww3_restart_file
+ forecast_postdet.sh[364]: (( vdate = first_ww3_restart_out ))
+ forecast_postdet.sh[364]: (( vdate <= forecast_end_cycle ))
+ forecast_postdet.sh[366]: ww3_restart_file=20200913.000000.restart.ww3
+ forecast_postdet.sh[367]: /bin/ln -sf /lfs/h2/emc/stmp/jessica.meixner/RUNDIRS/Opt3Sept/gfs.2020091300/gfsfcst.2020091300/restart/WW3_RESTART/20200913.000000.restart.ww3 20200913.000000.restart.ww3
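
The trace above shows the cycle `2020091300` being turned into the file name `20200913.000000.restart.ww3`. A minimal sketch of that mapping, assuming a `YYYYMMDDHH` input; the function name `ww3_restart_name` is hypothetical and is not taken from `forecast_postdet.sh`:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from forecast_postdet.sh): build a WW3 restart
# file name from a YYYYMMDDHH cycle string.
ww3_restart_name() {
  local vdate="$1"
  # Reject anything that is not exactly ten digits.
  if [[ ! "${vdate}" =~ ^[0-9]{10}$ ]]; then
    echo "ww3_restart_name: expected YYYYMMDDHH, got '${vdate}'" >&2
    return 1
  fi
  # Layout seen in the log: YYYYMMDD.HH0000.restart.ww3
  echo "${vdate:0:8}.${vdate:8:2}0000.restart.ww3"
}
```

For example, `ww3_restart_name 2020091300` prints `20200913.000000.restart.ww3`, the name that appears in both the stage_ic log and the forecast trace.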

What should have happened?

ICs should be used in the HR prototype-like cases.

What machines are impacted?

All or N/A

What global-workflow hash are you using?

Technically it's a branch, but this also appears to have happened in the HR4 tag.

Steps to reproduce

I'm running my branch here: https://github.com/JessicaMeixner-NOAA/global-workflow/tree/feature/uglo_15km
and staging my own IC directory.

I'm working from /lfs/h2/emc/couple/noscrub/jessica.meixner/WaveUglo15km/global-workflow/workflow
and using the script coupled.sh to set up experiments.

Additional information

I checked HR4 output and there are no wave ICs as HS is 0 at the first output (haven't confirmed from log though).

Have pinged @sbanihash to be sure to check recent GEFS experiment results.

Have not yet checked the low-res CI tests, but there's likely an example use case that could be used for debugging.

Do you have a proposed solution?

Trying to understand the logic in https://github.com/NOAA-EMC/global-workflow/blob/develop/ush/forecast_postdet.sh#L330-L359 and in the stage_ic job to figure out where the disconnect is.

JessicaMeixner-NOAA added the bug and triage labels Nov 18, 2024
@JessicaMeixner-NOAA
Contributor Author

It appears that the C48_S2SW CI test will demonstrate the issue.

@JessicaMeixner-NOAA
Contributor Author

Okay, so running the C48_S2SW CI test replicates this issue. Some log files are here:
/scratch1/NCEPDEV/climate/Jessica.Meixner/WaveICIssue/test01/COMROOT/test01/logs/2021032312

Stage_IC does:

2024-11-18 19:43:29,513 - INFO - file_utils : Copied /scratch1/NCEPDEV/global/glopara/data/ICSDIR/C48mx500/20240610/gfs.20210323/06/model/wave/restart/20210323.120000.restart.uglo_100km to /scratch1/NCEPDEV/climate/Jessica.Meixner/WaveICIssue/test01/COMROOT/test01/gfs.20210323/06//model/wave/restart/20210323.120000.restart.ww3

The forecast job goes into this section because it's a cold start, and then simply states:
https://github.com/NOAA-EMC/global-workflow/blob/develop/ush/forecast_postdet.sh#L356-L359
echo "WW3 will start from rest!"

However, many times we have staged ICs, so I don't quite understand this logic or when this change happened. Still trying to go back and understand that.

@aerorahul @WalterKolczynski-NOAA @KateFriedman-NOAA or others, are you aware of why it is doing this?

Some PRs of potential reference:
#2510
#3009
And the changes in stage_ic.

For GEFS, @sbanihash @NeilBarton-NOAA, is there a CI test that has staged wave ICs that I can run to make sure I don't break that with these updates?

@NeilBarton-NOAA
Contributor

@JessicaMeixner-NOAA The GEFS CI test is at https://github.com/NOAA-EMC/global-workflow/blob/develop/ci/cases/pr/C96_S2SWA_gefs_replay_ics.yaml

@WalterKolczynski-NOAA
Contributor

It was changed in #2510 when Rahul overhauled the entire set of forecast scripts. I think I actually fixed it in #3009, but neglected to remove the (I think) now-erroneous message.

WalterKolczynski-NOAA removed the triage label Nov 19, 2024
@JessicaMeixner-NOAA
Contributor Author

It was changed in #2510 when Rahul overhauled the entire set of forecast scripts. I think I actually fixed it in #3009, but neglected to remove the (I think) now-erroneous message.

@WalterKolczynski-NOAA - where are the ICs copied then? I checked the actual output of the wave model and no wave IC is being used, so it's not just an erroneous message; this is a huge problem. In discussion with @NeilBarton-NOAA and @sbanihash, we think it'd be a good idea to let the user choose, via a flag, whether the job should error out when there is no wave IC, so that people have an easier way of knowing about this. I think GEFS is okay because it's a warm start, whereas the tests I'm using are cold starts, which put you in different parts of the loop, but we'll continue to check.
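
The flag suggested in this comment could look roughly like the following. This is only a sketch under stated assumptions: the variable `WW3_IC_REQUIRED` and the function `check_wave_ic` are hypothetical names, not existing global-workflow settings.

```bash
#!/usr/bin/env bash
# Hypothetical switch (not an existing workflow variable): "YES" makes a
# missing wave IC an error; any other value only warns and lets WW3 start
# from rest.
WW3_IC_REQUIRED="${WW3_IC_REQUIRED:-NO}"

# Hypothetical helper: return 0 when a non-empty restart file exists at the
# given path; otherwise warn or fail depending on WW3_IC_REQUIRED.
check_wave_ic() {
  local ic_file="$1"
  if [[ -s "${ic_file}" ]]; then
    echo "Found wave IC: ${ic_file}"
    return 0
  fi
  if [[ "${WW3_IC_REQUIRED}" == "YES" ]]; then
    echo "FATAL ERROR: wave IC not found: ${ic_file}" >&2
    return 1
  fi
  echo "WARNING: wave IC not found, WW3 will start from rest: ${ic_file}" >&2
  return 0
}
```

With the switch set to `YES`, a caller can abort the forecast job on a non-zero return; with the default, the warning at least makes a silent start from rest visible in the log.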

@WalterKolczynski-NOAA
Contributor

Okay, I see the problem now. Should be an easy fix.

WalterKolczynski-NOAA self-assigned this Nov 19, 2024
@JessicaMeixner-NOAA
Contributor Author

@WalterKolczynski-NOAA - I've been working on a fix for this, however it'll be great if you take this over. If you will be working on this, can you let me know what your timeline for a fix will be?

@WalterKolczynski-NOAA
Contributor

@WalterKolczynski-NOAA - I've been working on a fix for this, however it'll be great if you take this over. If you will be working on this, can you let me know what your timeline for a fix will be?

I'm going to do it today.

@JessicaMeixner-NOAA
Contributor Author

@WalterKolczynski-NOAA great thanks!

Just to share where I was at: I have local changes in /scratch1/NCEPDEV/climate/Jessica.Meixner/WaveICIssue/global-workflow, which I tested in /scratch1/NCEPDEV/climate/Jessica.Meixner/WaveICIssue/test02. They did not work because this did not match the stage_ic location. I'm not sure whether this means the stage_ic job should also be updated to be more in sync with where a "warm" start IC would be expected, or whether I did not update things correctly on my end or made another mistake, but I thought I'd share where I was at.

WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Nov 19, 2024
Wave restart files were not being copied into the run directory for
cold starts. Additionally, the previous restart directory used as
the source for wave restarts (for non-RERUN) was always looking to
the gdas RUN for gfs runs, which I do not believe is correct for
waves since there is no DA.

Resolves NOAA-EMC#3109
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Nov 19, 2024
The stage job was incorrectly putting wave restarts into the gfs
directory. The forecast job looks for them in the gdas directory,
so this is updated.

Additionally, the restarts were also not being copied from the
staged directory to `$DATA`, so now they are. The process is
identical to that of non-RERUN warm starts, so the code is
refactored a bit to avoid duplication.

Resolves NOAA-EMC#3109
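
One way to picture the refactoring described in this commit message is a single helper shared by the cold-start and warm-start paths. The following is a sketch only, not the code from the referenced commit; `copy_wave_restart` and its arguments are hypothetical, and the file name WW3 expects inside the run directory is deliberately left out of scope.

```bash
#!/usr/bin/env bash
# Hypothetical helper shared by cold and warm starts: copy one wave restart
# file from a source directory into the run directory.
#   $1 = directory holding the restart (staged ICs or a previous cycle)
#   $2 = restart file name, e.g. 20200913.000000.restart.ww3
#   $3 = run directory (plays the role of $DATA)
copy_wave_restart() {
  local src_dir="$1" restart_file="$2" run_dir="$3"
  # Refuse to continue silently when the source is missing or empty.
  if [[ ! -s "${src_dir}/${restart_file}" ]]; then
    echo "WARNING: no wave restart at ${src_dir}/${restart_file}" >&2
    return 1
  fi
  mkdir -p "${run_dir}"
  # Keep the original file name; renaming for the model is not covered here.
  cp -f "${src_dir}/${restart_file}" "${run_dir}/"
}
```

Both branches would call the same function and differ only in the source directory they pass, which keeps the copy step from being skipped in one of them.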
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Nov 20, 2024
Adds a new version file for IC directories. Unlike other version
files, this one uses an associative array instead of different
variables.

With the version file in place, the versions are updated on most
of the directories to switch to the relocated wave restarts.

Refs: NOAA-EMC#3109
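
For readers unfamiliar with the pattern, a version file built on an associative array might look like the sketch below. The array name `ic_versions`, the lookup function, and the `example_dir` entry are hypothetical; the `C48mx500` / `20240610` pairing is only borrowed from the staged path quoted earlier in this thread, and the contents of the real version file are not reproduced here.

```bash
#!/usr/bin/env bash
# Requires bash 4+ for associative arrays.
# Hypothetical version table: IC directory name -> version string.
declare -A ic_versions=(
  ["C48mx500"]="20240610"     # pairing seen in the ICSDIR path in this thread
  ["example_dir"]="00000000"  # made-up placeholder entry
)

# Hypothetical lookup that fails loudly for unregistered directories.
ic_version_for() {
  local name="$1"
  if [[ -z "${ic_versions[${name}]+set}" ]]; then
    echo "No IC version registered for '${name}'" >&2
    return 1
  fi
  echo "${ic_versions[${name}]}"
}
```

Compared with one variable per directory, a single table keeps all versions in one place and lets a lookup fail explicitly when a directory has no entry.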