To support a variety of Python-enabled workflows, users may want separate Python environments in which they can install specific packages or package versions for later use. Since Python 3.3, the standard library has included a module for creating “virtual environments”. This module allows users to create any number of named virtual environments with different packages installed.
Background
All recent versions of Python allow users to install their own “local” Python packages with tools like pip. However, these packages are installed alongside the “base” Python interpreter. To install a newer version of a package, or a package that requires a newer version of a previously installed package, you would therefore need to change this installation. Such changes can cause problems when research workflows shift or when an environment must be preserved to reproduce workflows or important results. Users are therefore encouraged to make frequent use of virtual environments. Virtual environments are lightweight sets of packages that are independent of and isolated from the “base” Python interpreter. They can be named, “toggled” on and off, or switched between to accommodate the evolving research needs of the user.
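As a quick way to see which side of this split a script is running on, the following minimal sketch compares two attributes of the standard `sys` module. The function names are illustrative, not part of any existing tool, and the check applies to environments created with venv; a conda environment is a separate installation and is not detected this way.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # still points at the "base" installation it was created from.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarise which interpreter and prefix are currently in use."""
    kind = "virtual environment" if in_virtual_environment() else "base interpreter"
    return f"{kind}: {sys.executable} (prefix: {sys.prefix})"
```

Calling `print(describe_interpreter())` at the top of a script records which interpreter actually ran it, which can help when reproducing results later.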
Python VEnv
On Python 3.3 and newer, a venv module is available. To use it, ensure that your base version of Python is at least 3.3 (via some module commands) and read through the following webpage: https://docs.python.org/3/library/venv.html. Please be sure to store this virtual environment in your research storage volumes and not in your home directory. For example, store it in something like /umbc/xfs1/groupName/users/randomUser
With careful construction, an entire research group may be able to share a virtual environment if it is stored in the common directories of the research volume.
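The same module can also be driven from Python itself. The sketch below is one possible way to do that, using `venv.EnvBuilder` from the standard library; the function name and the example directory are hypothetical, so substitute a path inside your own research volume.

```python
import venv
from pathlib import Path


def create_virtual_environment(env_dir: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at env_dir and return its path."""
    target = Path(env_dir).expanduser()
    # EnvBuilder is the class behind `python3 -m venv`; with_pip=True asks it
    # to bootstrap pip into the new environment.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(str(target))
    return target
```

For instance, `create_virtual_environment("/umbc/xfs1/groupName/users/randomUser/venvs/my_env")` would place the environment under the example research path (the `venvs/my_env` part is an assumed layout). On Linux, the environment is then activated from the shell with `source <env_dir>/bin/activate`.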
Anaconda
Anaconda also supports virtual environments for Python (and other programming languages). It carries slightly more overhead than the Python virtual environments discussed above, but is still widely used.
Loading the Module
module load Anaconda3/2020.07
The first time you use conda, you’ll need to run the following command (or something similar) in order for conda to interact with your Bash environment appropriately.
conda init bash
Getting Started
Create your Environment
Use the conda create command to build a virtual environment. By default, the environment files are saved to the /home/<username>/.conda/envs directory, which is linked to your ada research storage volume. You can specify a different location with the --prefix option. Choose meaningful environment names so that they are easy to read and reference.
With conda
The user can specify the name, Python version, packages, and package versions at environment creation time. Once a conda environment is created, a user can install packages using either conda install <package> or pip install <package>.
conda create --name=env_name <package1> <package2> ... <packageN>
or, to create an environment containing only Python and pip and add packages afterwards,
conda create --name=env_name python pip
followed by
conda activate env_name
pip install <package1> <package2> <package3> ... <packageN>
Using an environment
To “activate” a conda environment, run conda activate env_name as shown above.
If you’re in an interactive session (on the login node or in an srun session), that’s all you’ll need.
If you’re in a batch session, you must first run the following command within your SLURM batch script, before conda activate, so that your Bash environment can interact with conda correctly.
eval "$(conda shell.bash hook)"
To deactivate the environment:
conda deactivate
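In a batch job, a missed activation can go unnoticed until an import fails. As a hedged illustration, the sketch below lets a Python script check the CONDA_DEFAULT_ENV variable, which conda activate exports in the shell it runs in. The helper names are hypothetical, and env_name is the placeholder used above.

```python
import os
from typing import Optional


def active_conda_env() -> Optional[str]:
    """Return the name of the activated conda environment, or None."""
    # `conda activate` exports CONDA_DEFAULT_ENV in the shell it runs in.
    return os.environ.get("CONDA_DEFAULT_ENV")


def require_conda_env(expected: str) -> None:
    """Stop early if the expected conda environment is not active."""
    actual = active_conda_env()
    if actual != expected:
        raise RuntimeError(
            f"expected conda environment {expected!r}, but found {actual!r}; "
            "was `conda activate` run in this job?"
        )
```

Calling `require_conda_env("env_name")` as the first line of a job's Python script makes the job fail immediately with a readable message instead of partway through the run.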
Installing Packages
You can install required packages using conda or pip. Install packages with pip wherever possible, and fall back to conda only when a package is unavailable through pip. The general process is to activate an environment and then run pip or conda commands to install new packages into it. The environment you want to install into must be activated when you run the commands below.
With pip
Pip organizes packages alongside Python interpreters better than conda (as of this writing, 2021-09-01), so it’s advisable to at least start with pip. To use pip for installation into an existing environment, run the following command with your conda environment activated:
pip install <package-name>
Pip works by downloading the necessary files into /tmp and then using those files to install into your ~/.conda/envs/env_name. After the installation finishes, the downloaded files are removed from the temporary directory. For large packages, /tmp can fill up. In these cases, you can specify an alternate temporary directory for this part of the installation.
TMPDIR="/nfs/ada/<group>/users/<username>" pip install <package-name>
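Before a large install, it can help to check how much room the temporary directory has. The following is a small sketch using only the standard library; the function name is illustrative. Because `tempfile.gettempdir()` honours TMPDIR, running it with TMPDIR set as above reports on the alternate directory instead of /tmp.

```python
import shutil
import tempfile


def temp_dir_free_gib() -> float:
    """Return the free space, in GiB, of the directory used for temporary files."""
    # tempfile.gettempdir() consults TMPDIR first and falls back to
    # platform defaults such as /tmp.
    usage = shutil.disk_usage(tempfile.gettempdir())
    return usage.free / 2**30
```

If the reported value is small compared with the size of the package you intend to install, setting TMPDIR to a directory on your research volume is likely the safer choice.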
With conda
To install a package into the currently activated environment, use the following:
conda install <package-name>
To install the package into a specific environment, whether or not it is activated, pass the environment name:
conda install --name env_name <package-name>
Additional Conda Commands
List all of your environments:
conda info --envs
or
conda env list
Delete a Conda Environment:
conda remove --name <environment-name> --all
or
conda env remove --name <environment-name>
View a list of packages installed in an environment:
If the environment is already activated, run
conda list
or, for an environment that is not activated, run
conda list -n <environment-name>
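The installed Python packages can also be listed from inside a running script, for example to log them next to results. This is a minimal sketch using `importlib.metadata` (available in Python 3.8 and newer); the function name is an assumption, and unlike conda list it only sees Python packages visible to the running interpreter, not non-Python packages installed by conda.

```python
from importlib import metadata
from typing import List, Tuple


def installed_packages() -> List[Tuple[str, str]]:
    """Return (name, version) pairs for packages visible to this interpreter."""
    pairs = {
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    }
    # Sort case-insensitively by package name for stable, readable output.
    return sorted(pairs, key=lambda pair: pair[0].lower())
```

Writing the returned pairs to a text file at the start of a run gives a lightweight record of the environment that produced a given set of results.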