Python Environments

To support a variety of Python-enabled workflows, users may want separate Python environments where they can install specific packages or package versions for later use. Since Python 3.3, the standard library has included a module for creating “virtual environments”. This module allows users to create any number of named virtual environments, each with different packages installed.

 

Background

All recent versions of Python allow users to install their own “local” Python packages with tools like pip. However, these packages are installed alongside the “base” Python interpreter. To install a newer version of a package, or a package that requires a newer version of a previously installed package, you would therefore need to change this shared installation. Such changes can cause problems when research workflows shift or when an environment must be preserved to reproduce workflows or important results. Users are therefore encouraged to make frequent use of virtual environments. Virtual environments are lightweight sets of packages that are independent and isolated from the “base” Python interpreter. They can be named, “toggled” on and off, and switched between to accommodate the evolving research needs of the user.

 

Python venv

The venv module is available in Python 3.3 and later. To use it, make sure that your base version of Python is at least 3.3 (loading a suitable version with the module commands if necessary) and read through the following webpage: https://docs.python.org/3/library/venv.html

Please be sure to store the virtual environment in your research storage volume and not in your home directory, i.e., in something like /umbc/xfs1/groupName/users/randomUser

With careful construction, an entire research group may be able to share a virtual environment if it is stored in the common directories of the research volume.
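As a rough illustration of what the venv module does, the sketch below creates an environment from Python itself; the command-line equivalent is python3 -m venv <directory>. The helper name create_env and the directory names are hypothetical, and the example path should be replaced with your own research storage location.

```python
import venv
from pathlib import Path


def create_env(base_dir, name, with_pip=True):
    """Create a virtual environment at base_dir/name and return its path.

    create_env is a hypothetical helper, not part of the venv module.
    """
    env_dir = Path(base_dir) / name
    # venv.create builds the directory layout and writes pyvenv.cfg;
    # with_pip=True also bootstraps pip into the new environment.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


# Example call (the path is an assumption; use your own research volume):
# create_env("/umbc/xfs1/groupName/users/randomUser/venvs", "myproject")
```

Once created, the environment is activated with source <env_dir>/bin/activate and left again with deactivate.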

 

Anaconda

Anaconda also supports virtual environments for Python (and other programming languages). It carries slightly more overhead than the Python virtual environments discussed above, but is still widely used.

Loading the Module

module load Anaconda3/2020.07
The first time you use conda, you’ll need to run conda init bash (or the equivalent for your shell) so that conda can interact with your Bash environment appropriately.

Getting Started

Create your Environment

Use the conda create command to build a virtual environment. By default, the environment files are saved to the ‘/home/<username>/.conda/envs’ directory, which is linked to your ada research storage volume. You can specify a different location with the --prefix option. Give your environments meaningful names so that they are easy to read and reference.

With conda

You can specify the environment name, Python version, packages, and package versions when the environment is created. Once a conda environment exists, you can install packages into it using either conda install <package> or pip install <package>.

conda create --name=env_name <package1> <package2> ... <packageN>

or

conda create --name=env_name

followed by
conda activate env_name
pip install <package1> <package2> <package3> ... <packageN>

Using an environment

To “activate” a conda environment, run conda activate env_name, as shown above.

If you’re in an interactive session (on the login node or in an srun session), that’s all you’ll need.
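To confirm which environment is actually in use after activation, a short check from inside Python can help. This is a minimal sketch: describe_environment is a hypothetical helper, and it relies on conda activate setting the CONDA_DEFAULT_ENV variable and on a venv having a sys.prefix that differs from sys.base_prefix.

```python
import os
import sys


def describe_environment(environ=None):
    """Report which environment the running interpreter belongs to."""
    environ = os.environ if environ is None else environ
    return {
        # Installation prefix of this interpreter (the environment directory).
        "prefix": sys.prefix,
        # Inside a venv, sys.prefix differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # conda activate sets this to the environment name (or its path).
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


print(describe_environment())
```

Note that a conda environment is not a venv, so in_venv is expected to be False there even when conda_env is set.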

If you’re in a batch session, you must first run the following command within your SLURM batch script, before calling conda activate, so that your Bash environment can interact with conda correctly.
eval "$(conda shell.bash hook)"

To deactivate the environment:
conda deactivate

 

Installing Packages

You can install required packages using conda or pip. Install as many packages as possible with pip, and use conda only for packages that are unavailable through pip. The general process is to activate an environment and then run pip or conda commands to install new packages into it. The environment you want to install into must be activated when you run the commands below.

With pip

Pip organizes packages alongside Python interpreters better than conda does (as of this writing, 2021-09-01), so it’s advisable to at least start with pip. To use pip for installation into an existing environment, run the following command with your conda environment activated:

pip install <package-name>

Pip works by downloading the necessary files into /tmp and then using those files to install into your ~/.conda/envs/env_name directory. After the installation finishes, the downloaded files are removed from the temporary directory. For large packages, /tmp sometimes fills up. In these cases, you can specify an alternate temporary directory for this part of the installation.

TMPDIR="/nfs/ada/<group>/users/<username>" pip install <package-name>
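The same installation can be driven from a Python script, which is sometimes convenient in setup code. The sketch below is an assumption-laden illustration: pip_install is a hypothetical helper, and it calls pip as python -m pip so that the package lands in the environment of the interpreter that is running the script.

```python
import os
import subprocess
import sys


def pip_install(package, tmpdir=None, dry_run=False):
    """Install package with the pip belonging to the running interpreter.

    pip_install is a hypothetical helper written for this example.
    """
    # "python -m pip" targets whichever environment this interpreter is in.
    cmd = [sys.executable, "-m", "pip", "install", package]
    env = dict(os.environ)
    if tmpdir is not None:
        # pip places its temporary files under the directory named by TMPDIR.
        env["TMPDIR"] = tmpdir
    if not dry_run:
        # check=True raises CalledProcessError if pip exits with an error.
        subprocess.run(cmd, env=env, check=True)
    return cmd, env


# Example call (package name and path are placeholders):
# pip_install("<package-name>", tmpdir="/nfs/ada/<group>/users/<username>")
```

Passing dry_run=True returns the command and environment without running pip, which is a cheap way to check what would be executed.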

With conda

To install a specific package into the currently activated environment, use the following:

conda install <package-name>

If you want to install the package into a specific environment that is not currently activated, give the environment name in the command:

conda install --name env_name <package-name>

Additional Conda Commands

List all of your environments:
conda info --envs
or
conda env list
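If a script needs the list of environments, the JSON form of the same command is easier to process than the plain-text table. The following is a minimal sketch; parse_env_list and list_conda_envs are hypothetical helper names, and the code assumes conda is on your PATH and that the JSON output contains an "envs" list of environment paths.

```python
import json
import subprocess


def parse_env_list(json_text):
    """Return the environment paths from `conda env list --json` output."""
    return json.loads(json_text).get("envs", [])


def list_conda_envs():
    """Run conda and return the paths of all environments it knows about."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if conda exits with an error
    )
    return parse_env_list(result.stdout)


# Example call (requires the Anaconda module to be loaded):
# for path in list_conda_envs():
#     print(path)
```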

Delete a Conda Environment:
conda remove --name <environment-name> --all
or
conda env remove --name <environment-name>

View a list of packages installed in an environment:

If the environment is already activated, run

conda list
For an environment that is not activated, name it explicitly:
conda list -n <environment-name>
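From inside a running Python session, the installed Python packages can also be inspected with the standard library (Python 3.8 and later). This sketch uses a hypothetical helper, installed_packages, and reports only the Python distributions visible to the running interpreter; conda list additionally shows non-Python packages managed by conda.

```python
from importlib import metadata


def installed_packages():
    """Map each distribution name to its version for this interpreter."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        # Skip entries whose metadata is missing or has no Name field.
        name = meta.get("Name") if meta is not None else None
        if name:
            packages[name] = dist.version
    return packages


# Print the packages in a pip-style "name==version" format.
for name, version in sorted(installed_packages().items()):
    print(f"{name}=={version}")
```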