How packages make it into stsci metapackage? #397

Open
pllim opened this issue Aug 14, 2018 · 15 comments

Comments

@pllim
Contributor

pllim commented Aug 14, 2018

@tddesjardins asked why some packages like synphot, stsynphot, and webbpsf don't get installed by default using conda create -n astroconda stsci.

@jhunkeler
Contributor

Should they be? Is there a functional difference between synphot, stsynphot and pysynphot? Would adding the other packages to stsci cause confusion among end-users (e.g., generate tickets asking why there are three synphot packages)?

If I remember correctly, webbpsf was removed from stsci because it was huge and not everyone was going to use it. We opted not to bloat the environment size by several hundred megabytes and just let users install webbpsf whenever they needed it. (cc @mperrin)

@pllim
Contributor Author

pllim commented Aug 14, 2018

> Should they be?

Maybe @tddesjardins can comment on this.

@tddesjardins

I'm fine with leaving off webbpsf. My question was more to do with the synphot stuff as it seems like we're moving towards using synphot and stsynphot over pysynphot. At least, that was the direction I got from @pllim and Harry (sorry, don't know his username!).

@mperrin
Contributor

mperrin commented Aug 14, 2018

You remember correctly! webbpsf-data is about 350 MB (and used to be even larger in some earlier versions), so we decided not to make that part of the default. Some people thought we were taking up disk space unnecessarily for something they wouldn't use. It's easy enough to conda install it individually if you do want it, so there did not seem to be a substantial downside to making it an optional install.

@mperrin
Contributor

mperrin commented Aug 14, 2018

synphot would seem to be a similar case, since it relies on various potentially large data files (libraries of stellar atmospheres, etc.), which I believe also total many hundreds of MB.

PS: Incidentally, I too find the multiple versions of *synphot to be confusing and arguably user-hostile. Yes, I understand there are historical reasons, but it's not a great situation in the long run...

@pllim
Contributor Author

pllim commented Aug 14, 2018

> synphot would seem to be a similar case

Not really. Data files are managed separately by RedCat and not distributed with the package.

cc @hcferguson for other discussions.

@tddesjardins

Correct me if I'm wrong, but the file dependencies for *synphot are not downloaded through conda, correct? You have to go to the CRDS pages and download the reference file data for those.

@stscicrawford

I'm also helping RedCat to take a look at how to host those files -- it might be something to think about for Webbpsf as well. @mperrin -- should I open a separate issue in Webbpsf? While it is easy to install, it might be useful to have it be part of the jwst pipeline, with the option of grabbing the files if needed.

@pllim
Contributor Author

pllim commented Aug 14, 2018

> are not downloaded through conda, correct?

Correct! And in an ideal world, you only download what you need.

@mperrin
Contributor

mperrin commented Aug 14, 2018

@stscicrawford Thanks, but actually for WebbPSF we have an effective solution already. The webbpsf-data conda package is a lightweight wrapper for retrieving the .tar.gz file with the data and storing it as part of someone's conda environment. In this case we don't need finer granularity of that, and doing it this way also allows us to manage the versioning consistently for the code and data files.
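
For readers unfamiliar with the pattern described above, here is a minimal Python sketch of the general idea: take a .tar.gz data bundle and unpack it inside an environment prefix. This is not the actual webbpsf-data recipe. The function name `unpack_data_tarball`, the `share` subdirectory, and the file layout are all assumptions made up for illustration.

```python
import tarfile
from pathlib import Path


def unpack_data_tarball(tarball_path, env_prefix, subdir="share"):
    """Extract a .tar.gz bundle under <env_prefix>/<subdir>; return that directory.

    Hypothetical helper: the 'share' location is an assumption, not the
    layout any real package is known to use.
    """
    target = Path(env_prefix) / subdir
    target.mkdir(parents=True, exist_ok=True)
    root = target.resolve()
    with tarfile.open(tarball_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Refuse entries that would land outside the target directory.
            resolved = (root / member.name).resolve()
            if resolved != root and root not in resolved.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(target)
    return target
```

Because the data is unpacked as one unit per package build, the data version is tied to the package version, which is the consistency property mentioned above.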

@mperrin
Contributor

mperrin commented Aug 14, 2018

Which is to say, I'm not opposed to some alternative way of providing or hosting the data files for webbpsf, if it's useful for some other reason. But right now I don't see any clear need that would drive that as a priority.

@tddesjardins

I guess let's reverse the question and ask if the *synphot files should be managed via conda, similar to the webbpsf-data package? We have been having this issue of how best to obtain the files from CALSPEC, etc.

@pllim
Contributor Author

pllim commented Aug 14, 2018

😱 (backs away)

@stscicrawford

@tddesjardins That is what I am currently looking at: investigating different options for hosting the files and making them easier to download. We are still in the scoping stage, and I've mostly been looking at how the data is stored and versioned. Please feel free to send me your thoughts on how you'd like to access these data sets.

@jhunkeler
Contributor

jhunkeler commented Aug 14, 2018

Three main reasons why *synphot data was never turned into Conda package(s):

  1. The data is not versioned. The tarballs are replaced on the server whenever new data becomes available. That's not something we can work with.
  2. A single change requires a total repack of the data set.
  3. Eventually our channel would contain a lot of very large dead packages no one will ever touch.

*This has been discussed on numerous occasions with different people since 2015.
*webbpsf's data releases are infrequent and relatively small.
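
To make the first point concrete, here is a small Python sketch of how a tarball that is silently replaced under the same name could at least be detected by comparing content hashes. It is an illustration only, not something the thread says was implemented. The helper names and the `content_label` naming scheme are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a tarball's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(recorded_digest: str, current_bytes: bytes) -> bool:
    """True if the bytes now on the server differ from what was recorded."""
    return sha256_of(current_bytes) != recorded_digest


def content_label(name: str, data: bytes) -> str:
    """Hypothetical naming scheme: <name>-<first 12 hex chars of the digest>."""
    return f"{name}-{sha256_of(data)[:12]}"
```

Note that a hash only reveals that the contents changed. Every change would still produce a new full-size package, so points 2 and 3 remain.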


5 participants