Data transfer and HPSS storage of CS catalogs #394

Open · 29 tasks
heather999 opened this issue Apr 17, 2020 · 8 comments

@heather999 (Collaborator) commented Apr 17, 2020

As discussed at this week's CS meeting and in issue LSSTDESC/desc-help#11, there is a planned data transfer at NERSC from projecta to CFS. This includes the cosmoDC2 catalogs currently stored under /global/projecta/projectdirs/lsst/groups/CS. The CS group has identified which catalogs should remain active on CFS (and also be archived to NERSC HPSS) and which can be copied to HPSS and then removed.

Here is the list of catalogs, with their sizes, that will remain available on CFS:

  • 1.3T BC
  • 311G BPl
  • 2.9T GalacticusLibraries
  • 382G Outer_snapshots
  • 1.5T all_v0.1.0
  • 901G baseDC2_9.8C_v1.1_velocity_bug_fixes
  • 2.6T baseDC2_snapshots_v0.1
  • 259G baseDC2_cosmoDC2_v0.1_v0.2
  • 8.7T cosmoDC2_v1.0.0_full_highres
  • 34G cosmoDC2_v1.0.0_knots_addon
  • 5.2T cosmoDC2_v1.1.3_rs_scatter_query_tree
  • 43G cosmoDC2_v1.1.4_knots_addon
  • 2.2G trial_triaxial_sats
  • 82G um_snapshots
  • 5.2T cosmoDC2_v1.1.4_rs_scatter_query_tree_double
  • 171G plarsen_tmp

And here is the list of catalogs that will be copied to NERSC HPSS and then removed from CFS (a sketch of an htar-based archiving workflow follows the list):

  • 124G all_v0.2.0
  • 225G all_v0.2.0_small
  • 247M alphaq
  • 901G baseDC2_9.8C_v1.1
  • 8.7T cosmoDC2_v1.0.0
  • 24G cosmoDC2_test_z0_1
  • 1.4T cosmoDC2_v0.2.0
  • 629G cosmoDC2_v1.0.1_small_test
  • 5.2T cosmoDC2_v1.1.0
  • 43G cosmoDC2_v1.1.0_knots_addon
  • 5.2T cosmoDC2_v1.1.0_shear
  • 63G mass_sheets_behind_z3
  • 9.7G shear_addon
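
For illustration, here is a minimal sketch of how directories like these could be archived to NERSC HPSS with htar, the standard NERSC utility for writing directory trees directly into HPSS. The CFS root, HPSS prefix, and archive names are assumptions made for the example, not the actual commands used for this transfer.

```python
# A minimal sketch (not the actual transfer script): archive each directory
# slated for removal into NERSC HPSS with htar, then list the archive as a
# sanity check. All paths and names are illustrative assumptions.
import subprocess

CS_ROOT = "/global/cfs/cdirs/lsst/groups/CS/cosmoDC2"  # assumed CFS location
HPSS_PREFIX = "lsst/groups/CS/cosmoDC2"                # assumed HPSS destination

TO_ARCHIVE_AND_REMOVE = [
    "all_v0.2.0",
    "alphaq",
    "cosmoDC2_v1.0.0",
    # ...remaining directories from the list above
]

for name in TO_ARCHIVE_AND_REMOVE:
    archive = f"{HPSS_PREFIX}/{name}.tar"
    # htar writes the tar file directly into HPSS and stores a companion
    # index file alongside it.
    subprocess.run(["htar", "-cvf", archive, name], cwd=CS_ROOT, check=True)
    # List the archive contents before anything is deleted from CFS.
    subprocess.run(["htar", "-tvf", archive], check=True)
```

In a workflow like this, the directories would only be deleted from CFS after the htar listings have been checked against the originals.
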
@JoanneBogart (Contributor) commented:

I believe these directories comprise the whole of /global/projecta/projectdirs/lsst/groups/CS/cosmoDC2. Are they all catalogs? Of the items in the top list, only about half are registered in GCR. Perhaps it's reasonable to call the others catalogs as well; I'd just like to confirm.

@heather999 (Collaborator, Author) commented:

I think we would need to check with @evevkovacs and @yymao to see if all the catalogs that will remain on disk should be referenced in GCR.

@yymao (Member) commented Apr 22, 2020

No, not all of the subdirectories that CS asks to keep active on CFS need to be made available in GCRCatalogs. Some of these are intermediate data products that we don't expect regular end users to use.
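
For context on what "available in GCRCatalogs" means in practice: registered catalogs can be discovered and loaded through the GCRCatalogs Python package, while intermediate data products kept on CFS simply have no entry there. A minimal sketch (the catalog name is only an example of a registered catalog):

```python
# Discover which catalogs are registered and load one of them; only
# registered catalogs appear here, so intermediate data products kept on
# CFS are invisible to end users of GCRCatalogs.
import GCRCatalogs

print(sorted(GCRCatalogs.get_available_catalogs()))

cat = GCRCatalogs.load_catalog("cosmoDC2_v1.1.4_small")  # example catalog name
data = cat.get_quantities(["ra", "dec"])
```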

@evevkovacs (Contributor) commented:

The other directories are mostly auxiliary data that were used to make the catalogs and need to be kept. We can rethink this if need be, but for now I think it makes sense to keep them in the catalogs directory. They are all relevant to past or ongoing work.

@katrinheitmann (Contributor) commented:

With the new filesystem, has this all been sorted out? If not, what else needs to be done? @heather999 @yymao @evevkovacs. Maybe it's possible to write a very brief conclusion and close this issue? Thanks!

@heather999 (Collaborator, Author) commented:

We still want to back up all of the catalogs to HPSS, and those identified for removal can then be removed from CFS.
There has been so much other work going on reorganizing CFS that I haven't gotten back to backing up the catalogs yet. I want to use this issue to keep track of my progress as things are copied to HPSS.
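
One possible way to track that progress programmatically is sketched below, assuming each directory is archived as a single htar tar file under one HPSS prefix; the hsi call, the prefix, and the naming convention are assumptions for illustration only.

```python
# Hypothetical progress check: list the assumed HPSS prefix with hsi and
# report which directories from the removal list do not yet have an archive.
import subprocess

HPSS_PREFIX = "lsst/groups/CS/cosmoDC2"  # assumed HPSS destination
REMOVAL_LIST = [
    "all_v0.2.0", "all_v0.2.0_small", "alphaq", "baseDC2_9.8C_v1.1",
    "cosmoDC2_v1.0.0", "cosmoDC2_test_z0_1", "cosmoDC2_v0.2.0",
    "cosmoDC2_v1.0.1_small_test", "cosmoDC2_v1.1.0",
    "cosmoDC2_v1.1.0_knots_addon", "cosmoDC2_v1.1.0_shear",
    "mass_sheets_behind_z3", "shear_addon",
]

# hsi may print its listing to stderr rather than stdout, so scan both.
result = subprocess.run(["hsi", "ls", HPSS_PREFIX],
                        capture_output=True, text=True)
listing = result.stdout + result.stderr

archived = [name for name in REMOVAL_LIST if f"{name}.tar" in listing]
todo = [name for name in REMOVAL_LIST if name not in archived]
print("Archived so far:", archived)
print("Still to copy:", todo)
```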

@wmwv (Contributor) commented Jan 22, 2021

@heather999 Can this be marked as done?

@heather999 (Collaborator, Author) commented:

Unfortunately, not yet. Hoping to get this completed over the next couple of weeks.
