Add docs for ARC4 compile+run with CASIM/SOCRATES #46

Open
wants to merge 5 commits into base: master

leifdenby
Collaborator

@eers1 and @cemac-ccs I have worked through the word-document field guide for ARC4 that @cemac-ccs, @MarkUoLeeds, @craigpoku, @sjboeing and I wrote and turned it into a file to include with the MONC source-code. Please have a read through (and if you have time, a work through the steps). I changed the structure a little to reflect that the MONC source-code should be fetched from our fork on github rather than MOSRS.

It might be easier to view the document as rendered here: https://github.com/leifdenby/monc/blob/monc-arc4-casim-socrates/doc/ARC4.md

@leifdenby leifdenby requested a review from cemac-ccs April 23, 2021 15:33
@leifdenby leifdenby requested a review from eers1 April 26, 2021 11:19
Collaborator

@eers1 eers1 left a comment

Added comments about current script names, paths, commands and versions that I did differently.

doc/ARC4.md Outdated

Notes on versions:

- gnu version: At time of writing Craig Poku's copy of MONC only works with `gnu/4.4.7`, but changes for `gnu/8.3.0`
Collaborator

The gnu version 4.4.7 wasn't available to me, but it worked with 8.3.0 and I have previously compiled with just "module switch intel gnu". I guess that maybe uses the most recent version?

Collaborator

On arc4, the version listed as gnu/native is gnu 4.8.5. Arc2 used gnu 4.4.7 hence the version naming in the fcm make config file which was used to inform this line.
The module command should read module switch intel gnu, as 4.8.5 is the default on arc4, and references should refer to version 4.8.5 or gnu/native

Collaborator Author

Thanks both. I thought that you were saying that master on our fork currently doesn't work with gnu version 4.8.x? And that those changes for gnu 4.8.x are coming in as part of your pull-request for ARCHER2 support?

Collaborator

No, master doesn't work for gnu >= 7. All gnu 4.x.x will work with the current code
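
The compatibility statement above can be turned into a quick pre-compile check. This is only a sketch: `gnu_version_status` is a hypothetical helper name, and the "untested" answer for versions other than 4.x and 7 or later is an assumption, since the thread says nothing about them. Note also that an earlier comment reports a successful compile with 8.3.0, so the result should be treated as a hint rather than a guarantee.

```shell
# gnu_version_status VERSION (hypothetical helper, not part of MONC)
# Classifies a GNU compiler version against what this thread reports for
# MONC master: "All gnu 4.x.x will work", "doesn't work for gnu >= 7".
gnu_version_status() {
    major=${1%%.*}                      # "4.8.5" -> "4"
    case "$major" in
        ''|*[!0-9]*) echo "unknown"; return 0 ;;   # e.g. "native"
    esac
    if [ "$major" -eq 4 ]; then
        echo "ok"
    elif [ "$major" -ge 7 ]; then
        echo "unsupported"
    else
        echo "untested"                 # assumption: not covered in the thread
    fi
}

gnu_version_status 4.8.5
gnu_version_status 8.3.0
```

On a machine with gcc in the PATH, something like `gnu_version_status "$(gcc -dumpversion)"` would check whichever compiler the loaded module provides.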

Collaborator Author

@cemac-ccs could you check that what we've written now makes sense, please? Thank you


You need to place the `casim`/`socrates`/`casim_socrates` fcm make config file _after_ the MONC file.

You can change which versions of CASIM/SOCRATES are fetched from MOSRS by changing the `casim_revision` and `socrates_revision` variables in the `.cfg`-files. At time of writing the versions to use with MONC `0.9.0` are revision `um10.8` for SOCRATES and revision `6341` for CASIM. Later versions may require changes to MONC for compatibility.
Collaborator

The keyword "um10.8" didn't work for me, so I just used the version Chris had been using, revision 658.

Collaborator Author

Thanks! I'll put 658 down as the revision to use instead. Does that mean you edited fcm-make/socrates.cfg? If you did, I think we should include that change.

Collaborator

Yes! I was compiling with fcm-make/casim_socrates.cfg and just changed:
$socrates_revision{?} = um10.8
to
$socrates_revision{?} = 658

@cemac-ccs just checking that it was socrates 658 that you have been using?

Collaborator

Actually revision 358 (which is what gets checked out when you use the um10.8 flag). I might have misspoken on our call last week. Looking at the casim_socrates_mirror.cfg file, however, it seems that 593 is also a good revision to use.

Collaborator Author

Ok, so is the consensus that revision we should go with is 593? Or should we use the even newer 658 revision?

Collaborator

I have confirmed that it compiles with casim 6431 and socrates 658. I am just confirming that it runs on arc4 with these.
I should note, though, that 658 was a complete mix-up. Both the default in the files and the version Craig was using were the um10.8 flagged version, which is 358. I know for a fact that version 855 doesn't work because of more recent code changes, so using too new a version could cause problems.
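
For anyone repeating the revision change discussed in this thread, the edit to the `.cfg` file can be scripted. The sketch below is illustrative only: `set_fcm_revision` is a hypothetical helper, it runs against a throwaway copy rather than the real `fcm-make/casim_socrates.cfg`, and `358` is used purely because the thread identifies it as what `um10.8` resolves to, not because a final revision had been agreed.

```shell
# set_fcm_revision FILE COMPONENT REVISION (hypothetical helper)
# Rewrites the "$<component>_revision{?} = ..." line of an fcm make config.
set_fcm_revision() {
    file=$1 component=$2 revision=$3
    tmp=$(mktemp) || return 1
    # \$ matches the literal dollar sign that starts the variable name.
    sed "s/^\\\$${component}_revision{?} *=.*/\$${component}_revision{?} = ${revision}/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demo on a temporary file containing lines quoted in this thread.
cfg=$(mktemp)
printf '%s\n' \
    '$socrates_revision{?} = um10.8' \
    'extract.path-incl[socrates] = src/modules_core src/radiance_core' > "$cfg"

set_fcm_revision "$cfg" socrates 358
cat "$cfg"
```

Only the line that starts with the matching variable name is rewritten; every other line in the file is passed through unchanged.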

number_q_fields=9
```

3. External files providing reference profiles for radiation calculations (SOCRATES) and microphysics (CASIM). Craig Poku noted that providing _relative paths_ for these files based on where CASIM/SOCRATES is checked out within the MONC source tree (when fcm is used to fetch CASIM/SOCRATES sources from MOSRS) doesn't work and these configuration file parameters should point to where one has checked out the SOCRATES/CASIM source by hand.
Collaborator

This last point might be quite vague, depending on the user, if they are not familiar with mosrs, checking out branches, and the specific files for socrates. Not a problem now, but just if the user base grows, perhaps!

Collaborator Author

Yes, I agree. What do you think we should write instead? I didn't take more detailed notes when we typed up the field guide. But maybe we could mention the specific files required? Do you feel like writing a suggestion?

Collaborator

Yes, I'm happy to write a suggestion. Or what I did, at least! I'll email over or add as a comment.

Collaborator

I would write something like this:

Checking out SOCRATES/CASIM source code by hand

The steps are the same for SOCRATES and CASIM; substitute the appropriate name and path. First, create a branch for SOCRATES/CASIM from the command line on ARC4:

$> fcm bc <branch_name> fcm:<socrates/casim>.x_tr@<version_number>

Navigate to the directory you wish to store the source code and check out your branch:

$> fcm co <your_branch>

Where <your_branch> is the full Met Office URL printed at the end of branch create:

[info] Created: https://code.metoffice.gov.uk/svn/... 

If you want to build MONC with these branches, you need to change the source-code location in the fcm configuration file (e.g. fcm-make/casim_socrates.cfg) to point to your branches on MOSRS.

Change the SOCRATES/CASIM fcm-make .cfg file to point to your branches:

extract.location{primary}[casim] = fcm:casim.x_br
$casim_revision{?} = <revision_number>
extract.location[casim]  = branches/dev/<user>/<branch>
extract.location{diff}[casim]  = 
extract.path-incl[casim] = src
extract.path-excl[casim] = /
preprocess.prop{fpp.defs}[casim] = DEF_MODEL=MODEL_MONC MODEL_MONC=4

extract.location{primary}[socrates] = fcm:socrates.x_br
$socrates_revision{?} = <revision_number>
extract.location[socrates]  = branches/dev/<user>/<branch>
extract.location{diff}[socrates]  = 
extract.path-incl[socrates] = src/modules_core src/radiance_core
# exclude these modules since they conflict with CASIM
extract.path-excl[socrates] = / src/modules_core/missing_data_mod.F90 src/modules_core/parkind1.F90 src/modules_core/yomhook.F90

Build MONC with SOCRATES/CASIM using the fcm command and correct SOCRATES/CASIM config file, e.g.

$> fcm make -j4 -f fcm-make/monc-arc4-gnu.cfg -f fcm-make/casim_socrates.cfg
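
Because the CASIM/SOCRATES config has to come after the MONC config on the command line, it can help to assemble the command in one place. The following is a dry-run sketch that only prints the command: `monc_fcm_make_cmd` is a hypothetical helper, and the file name `fcm-make/casim.cfg` is an assumption inferred from the `casim`/`socrates`/`casim_socrates` naming in the documentation.

```shell
# monc_fcm_make_cmd OPTION (hypothetical helper)
# Prints, without running it, the fcm make command with the MONC config
# first and the chosen CASIM/SOCRATES config second.
monc_fcm_make_cmd() {
    monc_cfg=fcm-make/monc-arc4-gnu.cfg
    case "$1" in
        casim|socrates|casim_socrates)
            echo "fcm make -j4 -f $monc_cfg -f fcm-make/$1.cfg" ;;
        *)
            echo "unknown option: $1" >&2
            return 1 ;;
    esac
}

monc_fcm_make_cmd casim_socrates
```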

There are reference files for radiation calculations for SOCRATES that are hard-wired for use on MONSooN by default. The location of these needs to be updated in any MONC config (.mcf) files to point to your checked out branch.

spectral_file_lw = /projects/monc/fra23/socrates_spectra/ga7/sp_lw_ga7
spectral_file_sw = /projects/monc/fra23/socrates_spectra/ga7/sp_sw_ga7

can be changed to your local branch of SOCRATES, e.g.:

spectral_file_lw = /home/home02/<user>/socrates/data/spectra/ga7/sp_lw_ga7
spectral_file_sw = /home/home02/<user>/socrates/data/spectra/ga7/sp_sw_ga7
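
If several `.mcf` files need the same change, the two `spectral_file_*` lines can be repointed with a small script. This is a sketch under assumptions: `set_spectra_dir` is a hypothetical helper, `someuser` is a placeholder, the demo works on a temporary file, and the target directory must not contain `|`, `&` or backslashes because it is pasted into a sed expression.

```shell
# set_spectra_dir MCF DIR (hypothetical helper)
# Replaces the directory part of spectral_file_lw / spectral_file_sw,
# keeping the file name (e.g. sp_lw_ga7) as it is.
set_spectra_dir() {
    mcf=$1 dir=${2%/}                   # drop one trailing slash, if any
    tmp=$(mktemp) || return 1
    sed "s|^\(spectral_file_[ls]w *= *\).*/|\1${dir}/|" "$mcf" > "$tmp" &&
        cat "$tmp" > "$mcf"
    rm -f "$tmp"
}

# Demo on a temporary file with the default MONSooN paths quoted above.
mcf=$(mktemp)
printf '%s\n' \
    'spectral_file_lw = /projects/monc/fra23/socrates_spectra/ga7/sp_lw_ga7' \
    'spectral_file_sw = /projects/monc/fra23/socrates_spectra/ga7/sp_sw_ga7' > "$mcf"

set_spectra_dir "$mcf" /home/home02/someuser/socrates/data/spectra/ga7
cat "$mcf"
```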

Collaborator

This is great, Rachel. A possible addition could be explicit instructions to run `fcm commit` after making any changes to the casim or socrates code before compiling.

Collaborator Author

Thanks for writing this @eers1! I'm a bit confused though @eers1 and @cemac-ccs: is it really necessary to create a new branch on MOSRS and to commit back to it? We're not making any changes to CASIM/SOCRATES here, we're only trying to get specific revisions of the code so they can be compiled into MONC, no?

Collaborator

That is true, but having instructions on how to compile with a local source would be useful, and it is most likely that those who have local sources will have made changes to the code themselves.

A side note that I came across when running on Archer2 is that when defining the location of the socrates spectral files, if you use the ones in the monc source it may fail due to the absence of the sp_sw_ga7_k and sp_lw_ga7_k files in the same folder. I am not sure whether the fcm download of socrates can be set up to put these files in the correct place.
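
A quick way to catch the missing `_k` files before a run is to check the spectra directory up front. The sketch below assumes the four file names mentioned in this thread are the complete set needed, which may not hold for other configurations; `check_spectra_dir` is a hypothetical helper and the demo uses a throwaway directory.

```shell
# check_spectra_dir DIR (hypothetical helper)
# Prints each expected spectral file that is absent and returns non-zero
# if any is missing.
check_spectra_dir() {
    missing=0
    for f in sp_lw_ga7 sp_lw_ga7_k sp_sw_ga7 sp_sw_ga7_k; do
        if [ ! -f "$1/$f" ]; then
            echo "missing: $1/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Demo: a temporary directory that has the main files but not the _k files.
demo=$(mktemp -d)
touch "$demo/sp_lw_ga7" "$demo/sp_sw_ga7"
check_spectra_dir "$demo" || echo "spectra directory incomplete"
```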

Collaborator

Just coming back to this now, but yes sorry I maybe got carried away with making and checking out new branches. So without doing that, do you just change the SOCRATES/CASIM fcm-make .cfg file revision number but keep it pointing to the trunk? Also, I'm not sure how to point to the SOCRATES files without having them locally.

Like @cemac-ccs says, might be nice to keep the bit about new branches and have info about making changes to CASIM/SOCRATES, committing the changes and compiling with those branches.

Collaborator

@cemac-ccs cemac-ccs left a comment

Should be fine once the comments are addressed and the PRs have been brought in. I have started work on a PR for ease of compilation so that the compilation process for arc4 consists of running the script at utils/arc/monc_compile_arc.sh and selecting the appropriate compilation option (like I have for archer2). It just needs password caching properly working.

```

NOTE: before submitting the job ensure that the "standard out" path is cleared, so that the job isn't restarted from a previous run.

Collaborator

Would it be a good idea to create a restart script, similar to the one Craig was using (and the one that exists within the /misc/continuation.sh script)? I believe that all the code that we would need to add to the continuation script is in Craig's branch submission script.

Collaborator

A continuation script would be great! Richard Rigby did make one for ARC4 that he gave to me, I just hadn't got around to testing it since I worked through all my errors.

Collaborator Author

I think a continuation script could be a good idea, but I feel like they often get quite long and unwieldy and it's difficult to account for all the possible use cases. Are you thinking of one that works with MONC on any HPC system? And what changes would need to be made to the input files before running the restart file? Maybe you could open a new issue on this and we could hash out some ideas :) What do you think?

Collaborator

Yeah, I think moving that discussion to a new issue is wise.


Caching password is covered in above MOSRS link and here: [http://cms.ncas.ac.uk/wiki/MonsoonSshAgent](http://cms.ncas.ac.uk/wiki/MonsoonSshAgent).

## 2. Check fcm keywords
Collaborator

The compilation script in PR #48 includes keyword handling by using sed to modify the contents of the keyword.cfg file to point to the working folder, and sending the output of the sed command to ~/.metomi/fcm/keyword.cfg.

$> svn info https://code.metoffice.gov.uk/svn/test
```

Caching password is covered in above MOSRS link and here: [http://cms.ncas.ac.uk/wiki/MonsoonSshAgent](http://cms.ncas.ac.uk/wiki/MonsoonSshAgent).
Collaborator

The page linked only really deals with setting up ssh keys between monsoon and puma. I believe the correct link is [https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching]. It should be noted, however, that those instructions are not correct for these purposes, as they require rose to be installed for use as a checker.
I am going to speak to Richard about how best to modify the gpg scripts to be usable on arc4 without rose, and whether we can include the scripts in the cemac area on arc4 to be imported as a module or included in the PATH as part of the compilation script. Without some password caching, the branch create functionality won't work for Rachel's manual method, and compiling with the automated checkout requires the user to enter their password 4 times when compiling with either casim or socrates, and 7 times when compiling with both. When compiling with both, it asks for the password 3 times, then twice on the same line; a user who doesn't realise that it is waiting for the password a second time on that line will think the compilation script has hung indefinitely.

For compilation on Archer2 I have added a call to cache the password, so I will include that in the compilation script for ARC4 once I have got the caching actually working

Collaborator

Update - I have spoken to Richard and the issue with gpg password caching was due to the version of svn on arc4. The following needs to be added to allow password caching on arc4:

. /nobackup/cemac/cemac.sh
module load svn
export PATH=/nobackup/cemac/mosrs:$PATH
mosrs-setup-gpg-agent

I have added these lines to the compilation script.

Collaborator

Just tried giving this a go, because I was previously just mashing my password 4327 times.. it's coming up that the "mosrs-setup-gpg-agent" command is not found. Do I need to get that from somewhere else?

Collaborator

Sorry, I had the permissions wrong. Give it a try now

Collaborator

@cemac-ccs cemac-ccs Apr 30, 2021

Correction - the command is source mosrs-setup-gpg-agent or . mosrs-setup-gpg-agent. Just calling the script will not allow the gpg agent to persist.

Collaborator

Hmm, it's not finding the file..

Collaborator Author

Maybe it is best that I add a comment that password-caching doesn't currently work on ARC4? I'd like to get these instructions into master and then we can improve on them.

Collaborator

That might be wise in the short term Leif.

Rachel, which file is it saying it can't find? Did you run all four lines? You should be able to confirm whether it worked or not by running svn info --non-interactive https://code.metoffice.gov.uk/svn/test. If it worked properly then it should give you a status output.

Collaborator

It's when I try the command that is source mosrs-setup-gpg-agent or . mosrs-setup-gpg-agent. It says

mosrs-setup-gpg-agent: No such file or directory

I ran all four lines and tried both variations of that command and when I run the test it fails.


@leifdenby leifdenby mentioned this pull request May 4, 2021
@leifdenby
Collaborator Author

@cemac-ccs based on your work in #48 I should add a line at the start of the section c. Compiling MONC with SOCRATES and CASIM indicating that the compile script now takes care of setting up the fcm keywords and obtaining the source-code.

@eers1 and @cemac-ccs: I think it might be worth considering the situation where people want to modify CASIM/SOCRATES themselves separately from simply running MONC with either. I could take the notes you wrote @eers1 and write that people are encouraged to create their own branch on MOSRS if they are planning on modifying CASIM/SOCRATES and then show what changes need to be made to use those branches on MOSRS. If someone isn't planning on modifying CASIM/SOCRATES then they don't need to create a branch on MOSRS, right?

@cemachelen

hi, I found some parts of the set up hard to follow, for example getting MOSRS password caching to work.

If it helps I created some additional cemac modules via the central cemac account so the module loading would look like

# load cemac modules
. /nobackup/cemac/cemac.sh
# Note default svn does not work
module switch intel gnu/8.3.0
module load fftw netcdf hdf5 fcm
module load svn 
module load rose
module load mosrs

then I could run
. mosrs-setup-gpg-agent
from a user account

@cemachelen

cemachelen commented Nov 1, 2021

also the test cases using Socrates all require pointing to a spectra file. Does it help to have some in a centralized location?

I added them in a persistent location on arc4 in case it is useful

/nobackup/cemac/socrates_1706/data/spectra/ga7

@leifdenby
Collaborator Author

> hi I found some parts of the set up hard to follow getting mosrs caching password for example.
>
> If it helps I created some additional cemac modules via the central cemac account so the module loading would look like

This is great @cemachelen, thank you! Would you mind branching from my branch, changing the things that are unclear and then just posting a link to your branch here? I'll then merge your changes into the branch of this pull request and get it merged in :)

@leifdenby
Collaborator Author

> also the test cases using Socrates all require pointing to a spectra file. Does it help to have some in a centralized location?
>
> I added them in a persistent location on arc4 in case it is useful

That's a good idea! If that directory will continue to exist then we can use that. Could you add a note to the documentation about that too?
