BSSw Community Discussion #2

Open
amaji opened this issue Apr 21, 2022 · 8 comments

@amaji
Owner

amaji commented Apr 21, 2022

The objective of this thread is to collect requirements for improving conda-env-mod and turning it into production software that can be used across multiple HPC centers.

For a high-level overview of conda-env-mod, please see the slides in the presentation directory or refer to our HUST20 paper:

Amiya K. Maji, Lev Gorenstein, and Geoffrey Lentner. "Demystifying Python Package Installation with conda-env-mod." 
In Proceedings of the 7th Annual Workshop on HPC User Support Tools (HUST 2020), pp. 1-10. 2020.

Please add your suggestions as comments in this thread. If a discussion topic becomes too complicated we can move it into its own thread/issue.

Suggested topics:

  • What are the common issues users face at your HPC center related to Python package installation?
  • What are the common mistakes users make?
  • What best practices do you follow to keep your Python packages under control?
  • Which Python environment variables do you use in your environment (e.g., PYTHONPATH, PYTHONUSERBASE etc.)?
  • Do you use any sanity checks when installing Python packages? What checks do you use?

Current features of conda-env-mod

  • Supports conda environments
  • Environment creation/deletion
  • Lmod module files
  • Module file creation/deletion
  • Jupyter kernel creation/deletion

Proposed additional features

  • Add support for multiple environment backends: conda, venv/pipenv
  • Add support for different module file formats: Lmod, TCL, Bash/Csh
  • Add support for site-customized module templates
  • Add support for descriptive help messages
  • Add support for creating environment from yaml specs
  • Capture compiler/mpi dependencies in environment
  • Stretch goals
    • Add support for popular package installation
    • Dependency checking
    • Inform developers about missing dependencies and misconfigurations
@amaji added the discussion (thread for discussing one or more topics) label on Apr 21, 2022
@frobnitzem

frobnitzem commented Jun 7, 2022

Finding that our visualization cluster's conda module is currently based on python/3.7, I installed and tried this today.

First, I downloaded and ran the official installer to put anaconda3 into a project-work directory (I'll call $proj) for program files. The $proj directory name contains both a project ID and a system name so I don't cross architectures.

Then, I cloned this repo and ran some commands. At first, I thought I should source conda-env-mod so that functions like create_env were available. An exit 1 told me that was wrong.
Next, I copied conda-env-mod into $proj/bin and set up an env.sh like this:

export CONDA_PREFIX=$proj/anaconda3
export CONDA_ENVS_PATH=$proj/envs
export PATH=$CONDA_PREFIX/bin:$PATH:$proj/bin
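
With that in place, a session looks roughly like this (just a sketch; $proj as above, and conda-env-mod is on PATH via $proj/bin):

source $proj/env.sh
conda-env-mod create -n analysis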

Adding conda's python (and the conda executable) to my PATH should be documented. It makes sense, though, since I want to use the newest python and conda by default. However, it might be nice to default to working with an isolated anaconda so the user works exclusively with conda-env-mod.

We probably need a separate thread to address all the security concerns with using conda, but one that I notice right away is how creating a new environment makes its own copy of openssl and ca-certificates. Our site prefers us to use their installation because they are very good at keeping it up to date.

The environment I created with conda-env-mod create -n analysis went into the project directory as expected, but the modules associated with it went to $HOME/privatemodules. When I realized that, I changed MODULE_TOP_DEF in the script to point to $proj/modules.

It would be good to warn users about conda caches - which can quickly grow to tens of gigabytes. Documentation should say where conda keeps its download and compile cache so that I can clear it and/or make sure it doesn't get added to automatic site backups.
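
For illustration, the kind of commands involved (the default cache locations here are assumptions; conda info reports the actual ones for a given setup):

conda info | grep -i cache            # conda prints its "package cache" (pkgs) directories
du -sh ~/.conda/pkgs ~/.cache/pip     # typical default locations for conda and pip caches
conda clean --all --yes               # remove conda's downloaded tarballs and extracted packages
pip cache purge                       # remove pip's wheel/HTTP cache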

Summarizing my comments -- the modules-based wrapper for conda is definitely a nice way to connect with HPC cluster users. I suggest adding more documentation about installing and setting up conda-env-mod so it's accessible within a shared project.

As for feature improvements, I would go in a different direction from your suggestions above. I'm a big fan of version-controlling environments. Modules kind of does that already with its "name/version" naming scheme. Why not set up conda-env-mod to be able to create an environment from a git repository containing a conda env YAML file? Then a project group can debate changes to the environment via GitHub.
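
Roughly the workflow I have in mind, written with plain conda for now (the repository URL and file names below are placeholders, not anything conda-env-mod provides today):

git clone https://github.com/<group>/project-envs.git
conda env create -f project-envs/analysis.yml -p $proj/envs/analysis-1.0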

@amaji
Owner Author

amaji commented Jun 8, 2022

@frobnitzem Thank you for the detailed comments and suggestions! We certainly need to make this easier to install and use (and provide better documentation).

I also like your last suggestion. Creating environments from myenv.yaml (much like conda env create -f) is on our radar, and we will implement that.

Regarding module files: you can change the location of module files with the -m /path/to/modules option. There is no need to manually edit MODULE_TOP_DEF.
https://github.com/amaji/conda-env-mod/blob/master/share/man/man1/conda-env-mod.1.md
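
For example, a sketch (combined with module use so the generated module file is visible; see the man page above for the exact module names):

conda-env-mod create -n analysis -m $proj/modules
module use $proj/modules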

We have also been discussing adding cache-related variables (CONDA_CACHE, PIP_CACHE, etc.) to the module files. How about setting the default cache location to CONDA_ENVS_PATH?
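
For reference, the variables that conda and pip themselves honor are CONDA_PKGS_DIRS and PIP_CACHE_DIR; a sketch of what could be exported (the subdirectory names here are placeholders):

export CONDA_PKGS_DIRS=$CONDA_ENVS_PATH/pkgs_cache   # conda download/extract cache
export PIP_CACHE_DIR=$CONDA_ENVS_PATH/pip_cache      # pip wheel/download cache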

@hnamLANL

  • The proposed additional features are vague. I recommend providing additional detail on the requirements so we get a better sense of what each requirement entails.
  • I recommend trying to install on different ECP-focused HPC systems (e.g., NERSC, OLCF, ALCF) so you can see firsthand the issues of managing Python on systems that have highly optimized, vendor-controlled environments. Let me know if you need help getting access. We can make that happen via ECP accounts.

@amaji
Owner Author

amaji commented Jun 13, 2022

@hnamLANL
That would be great! Can you please get me access to these systems?
Seeing what the python environment looks like would certainly help a lot.

@frobnitzem

frobnitzem commented Jun 13, 2022

I should also write a response to your questions!

  • Suggested topics:

    • What are the common issues users face at your HPC center related to Python package installation?
      Site-provided python versions have some packages installed, but not all of the ones I need. Adding the packages I need works so long as they're pure python and use pip (setting PYTHONPATH), but sometimes pip fails, and sometimes the site's provided packages conflict with newer versions I need. Re-compiling the base packages provided by the centers (like mpi4py) is tricky because compiled library dependence requires a mix of system library paths and modules that is not easy to determine.

    • What are the common mistakes users make?
      Users tend to put software in their $HOME directory. This prevents multiple users from collaborating effectively on the same project! In some cases, it just duplicates effort. In others, it leads to different results for different users. Most users are also not aware of some "easy" methods to create virtual environments, extend existing conda environments or patch python packages with PYTHONPATH.

    • What best practices do you follow to keep your Python packages under control?
      I try to maintain requirements.txt or Poetry config files in each program I develop. This can still be tricky if I need to maintain python 3.7- and 3.9-compatible installs -- say, because I'm running on multiple system environments. I also try to keep one install directory per project and delete all python local files from $HOME. If using a pure python package across systems, python's cache can cause problems. I get around this by creating different venv install directories on different systems, using code like case $systemname in summit) module load ...; source summit/bin/activate;; andes) module load ...; source andes/bin/activate;; esac; (expanded into a readable sketch at the end of this comment).

    • Which Python environment variables do you use in your environment (e.g., PYTHONPATH, PYTHONUSERBASE etc.)?
      Just the first one.

    • Do you use any sanity checks when installing Python packages? What checks do you use?
      I rely on pip's dependency checking - which usually works.

  • Current features of conda-env-mod

    • Supports conda environments: I'm still not entirely happy installing binary packages and would rather compile them myself.
    • Environment creation/deletion: It's nice to have a wrapper. Conda's commands are hard to remember.
    • Lmod module files: useful
    • Module file creation/deletion: need to be able to list these too
    • Jupyter kernel creation/deletion: have not tried
  • Proposed additional features

    • Add support for multiple environment backends: conda, venv/pipenv: Documentation giving an idea of how to do this, rather than precise scripts, would be extremely helpful.
    • Add support for different module file formats: Lmod, TCL, Bash/Csh: I'm happy with lmod only.
    • Add support for site-customized module templates: This requires some experience with systems before defining a "customization convention" that works across sites. Probably the key customizations will be project-install paths and dependencies on site-provided packages.
    • Add support for descriptive help messages: I think installation documentation and an online FAQ would be more helpful.
    • Add support for creating environment from yaml specs: I'd like this to be implemented with a "sync to yaml" command. It can look at one yaml file with a module version number. If the environment is already there, nothing happens (or an error is thrown if a missing/incorrect version package is detected). If the environment does not exist, a new module is created for it.
    • Capture compiler/mpi dependencies in environment: See spack's compilers.yaml and packages.yaml file formats for some hints here. It would be great if a subset of that information could be captured and used by this tool - e.g. external path / module for a dependency package.
    • Stretch goals: More user support is really hard. I think it's better to build code with solid install/test/diagnose functionality, then encourage users to report their own experiences with external packages, misconfigurations, etc. Target improvements to your tool's reporting this way.
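
Expanding the per-system activation switch mentioned above into a readable sketch (the module load lines are site-specific and left elided, as in my original snippet):

case "$systemname" in
  summit)
    # module load ...   (site-provided modules, elided)
    source summit/bin/activate ;;
  andes)
    # module load ...
    source andes/bin/activate ;;
  *)
    echo "unknown system: $systemname" >&2 ;;
esac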

@hnamLANL

@amaji - sending you information on getting access to Perlmutter/NERSC via email.

@sswan

sswan commented Jun 15, 2022

For this tool to scratch my particular itch, it will need to address the issue of a team creating an environment for their product in a way that the user cannot change. Our product requires certain versions and packages to function correctly, and the last thing I want to do is help a bunch of users correctly configure their own python environments. As it is, it's easier to maintain our own install of what we need everywhere we use our product than to try to use a system-wide solution.

@shuds13

shuds13 commented Jun 15, 2022

To the first question:

What are the common issues users face at your HPC center related to Python package installation?

As a user the most common issues I face are to do with mpi4py installation, which needs to match with the correct MPI. Centers may provide a base conda environment with the correctly installed mpi4py (and possibly other things) but I cannot install packages on top of that environment, so I have to create my own. I can clone the environment, but this brings all packages with it (taking a long time and multiple GB).

I would like to be able to 'extend' a central environment with my own (similar to --user, but still within conda). Some places (e.g., NERSC's lazy-mpi4py) provide a clonable environment with a small number of key machine-specific packages (including mpi4py). This is the best approach so far.

I guess if this tool makes it easy to create and share an environment as a module, then it would be easier for anyone to create such modules.
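
For context, the clone-then-extend workflow I mean, in plain conda (the module and package names here are placeholders; the clonable environment name follows the NERSC example above):

module load python                                 # placeholder: the site's base conda/python module
conda create -n my-analysis --clone lazy-mpi4py    # clone the small site-provided environment
conda activate my-analysis
conda install my-extra-package                     # placeholder: then add packages on top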

Test drive of package:

I was able to run and share a basic module, with help from https://www.rcac.purdue.edu/knowledge/brown/run/examples/apps/python/packages. It did refer to an rcac-conda-env script - is this an old name?

The only issue I had is that when I ran the script, it created an environment and installed its own python - but the bin dir did not get prepended to my PATH, so it was not picking up that python. I had to activate the environment for that to happen, or add it manually. I wasn't clear on whether I should need to activate.

Also, I got this warning.
WARNING: Couldn't find an anaconda module.

An anaconda3 module was loaded, but it seems this does not set $CONDA_MODULE on my system.

Other thoughts:

Simplifying Jupyter support is nice.

Different module file formats seem a good idea. I can't really say about different environment backends, as I only use conda; to my mind it's pretty dominant (but others can comment on that).

Any ability to revert by storing snapshots could be useful. Conda has some ability with revisions, but from memory I think it only works properly for conda-installed packages.
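
For reference, conda's revision commands (as noted, they only track conda-installed packages reliably):

conda list --revisions        # show numbered snapshots of the environment's history
conda install --revision 2    # roll the environment back to revision 2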
