Storage

Storage in the ada cluster environment consists of a personal home directory, shared group storage, and scratch storage. The first two are remotely hosted volumes mounted via the Network File System (NFS). As a result, these volumes are available across the ada cluster environment, and each guarantees close-to-open consistency via the NFS protocol [1]. Scratch storage is local to your allocated compute hardware and should be considered the fastest but also the most transient. At the end of this page you will find some simple commands for checking your data usage.

Throughout this document and for the purposes of generalization, consider $USER as a reference to a username and $GROUP a reference to the username of the faculty or staff group sponsor. For example, GROUP=smith8 would refer to group pi_smith8.
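As a quick illustration of this naming convention, the sketch below strips the pi_ prefix from a Unix group name to obtain the value used for $GROUP. It assumes that your primary Unix group follows the pi_<sponsor> pattern shown above; the helper name group_from_unix_group is hypothetical and not part of the cluster software.

```shell
# Hypothetical helper: turn a Unix group name such as pi_smith8 into smith8.
group_from_unix_group() {
    # Remove a leading "pi_" if present; other names pass through unchanged.
    printf '%s\n' "${1#pi_}"
}

# Assumption: your primary group (reported by id -gn) is the pi_<sponsor> group.
USER="${USER:-$(id -un)}"
GROUP="$(group_from_unix_group "$(id -gn)")"
echo "USER=$USER GROUP=$GROUP"
```

If your primary group is not your sponsor's group, set GROUP by hand instead (e.g., GROUP=smith8).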


Home directory

This is the home directory for your account, and it will be your working directory immediately after login. It should be used for setting up your Unix user environment, as well as for storing small amounts of data (e.g., source code, configuration files). The space in /home/$USER is very limited, only 500 MB, so use this space wisely. A Unix “shortcut” to a user’s home directory is ~. Home directories are NFS-mounted from our Isilon Storage System.

What to store in your home directory

Consider storing important code or configuration files in your home directory.

Certainly don’t store data or output files in your home directory.

Please be aware that this storage is backed up.


Group Storage

There are two types of shared group storage, both referred to as “research storage”. Neither type is backed up.

HPCF Research Storage

This is research storage that is available across the UMBC HPCF.

HPCF research storage is split into volumes named after research groups, so each volume is group-specific and shared by the members of that group. The volumes that you have access to are available via symbolic links, which are created by default in your home directory.

 

Symbolic Link          Actual Location               Description
~/${GROUP}_common      /nfs/rs/$GROUP/common         Shared storage for your research group
~/${GROUP}_user        /nfs/rs/$GROUP/users/$USER    Personal storage in research storage

 

By default, each research storage volume starts with 250 GB of storage. You can use df to determine how much storage is available. To accommodate users familiar with the taki cluster environment, these HPCF Research Storage locations are also aliased under /umbc/xfs1 (e.g., /umbc/xfs1/$GROUP/common).
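For example, the following sketch reports the free space on the filesystem behind a path and shows where a home-directory link points. The function names volume_usage and link_target are hypothetical, and the commented calls assume that $GROUP is set as described at the top of this page.

```shell
# Print size, used, and available space for the filesystem holding a path.
# -h prints human-readable sizes; -P keeps each filesystem on one line.
volume_usage() {
    df -hP "$1"
}

# Print the location that a symbolic link points to.
link_target() {
    readlink "$1"
}

# Example calls on the cluster (assumes GROUP is set, e.g. GROUP=smith8):
#   volume_usage "$HOME/${GROUP}_common"
#   link_target  "$HOME/${GROUP}_common"   # per the table: /nfs/rs/$GROUP/common
```

Because the volume is shared, the "available" figure reported by df reflects usage by everyone in your group, not just your own files.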

HPCF Research Storage is NFS-mounted from our dedicated HPCF Research Storage Servers, shared by all research computing infrastructure.

What to store in HPCF Research Storage

Consider storing your personal data, model files, or experimental output in your ~/${GROUP}_user directory.

Consider storing group data or code in your ~/${GROUP}_common directory.

Certainly do not store critical data in HPCF Research Storage.

Please be aware that this is shared by everyone in your group.

Please be aware that this storage is not backed up.

ada Research Storage

Users also have access to group-shared research storage volumes separate from their group-shared HPCF research storage volumes. These are the ada research storage volumes. This storage is internal to the ada cluster environment and is not accessible outside of the ada cluster environment.

In the home directory of every user there are symbolic links to the ada research storage volume associated with the primary group of the user. These links are described below.

Symbolic Link       Actual Location                       Description
~/.conda            /nfs/ada/$GROUP/users/$USER/.conda    Storage area for a user’s conda files
~/.cache            /nfs/ada/$GROUP/users/$USER/.cache    Storage area for a user’s cache files
~/${GROUP}_ada      /nfs/ada/$GROUP                       Storage area for ada research storage

The ada Research Storage is NFS-mounted from a file server local to the ada cluster compute hardware.

What to store in ada Research Storage

Consult with your faculty or staff sponsor to determine what should be stored in your ~/${GROUP}_ada directory.

Certainly do not store critical data in ada Research Storage.

Please be aware that this is shared by everyone in your group.

Please be aware that this storage is not backed up.


Scratch Storage

Scratch storage is available on compute nodes while jobs are running. It is primarily intended as temporary storage that provides extremely high I/O throughput to and from disk.

Because scratch space is local to the compute node, it is generally much faster than your home directory or any other network filesystem, and it provides lower latency when performing a large number of I/O operations. This makes it ideal for storing intermediate data files specific to running jobs, which also reduces the load on system-wide storage and the local network.

If your job needs to read large datasets into memory, it is strongly recommended that you first copy the relevant data to the /scratch volume, run your application as a Slurm job, and direct its output to /scratch storage. Finally, any data that you want to retain must be moved from /scratch to HPCF permanent storage.

To copy data files from HPCF research storage to your /scratch directory at the beginning of a job, put copy commands (e.g., cp) in your Slurm script so that they run as part of the job. You will usually also want to copy the output data in /scratch to HPCF permanent storage before the job ends.

Your scratch directory is /scratch/$SLURM_JOBID, where the $SLURM_JOBID environment variable holds the ID of your job. You can also reach this directory through the $JOB_SCRATCH_DIR environment variable. See the page on SLURM Environment Variables for more information.

This directory is created when your job starts and is deleted when your job finishes, so it can only be used while your job is running.
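The workflow described above can be sketched as a Slurm batch script that stages data into scratch, runs an application there, and copies the results back. This is a minimal sketch, not a site-provided template: the #SBATCH values, the dataset and results paths, the application name my_app with its --in and --out options, and the helper names stage_in and stage_out are all hypothetical placeholders. It also assumes that GROUP is set in the job's environment.

```shell
#!/bin/bash
#SBATCH --job-name=scratch_demo      # hypothetical job name
#SBATCH --output=slurm-%j.out        # %j expands to the job ID
#SBATCH --time=01:00:00              # placeholder time limit

# Copy input data from research storage into job-local scratch.
stage_in() {   # usage: stage_in SOURCE_DIR SCRATCH_DIR
    mkdir -p "$2/input" && cp -r "$1/." "$2/input/"
}

# Copy results from scratch back to research storage before the job ends.
stage_out() {  # usage: stage_out SCRATCH_DIR DEST_DIR
    mkdir -p "$2" && cp -r "$1/output/." "$2/"
}

# Run the workflow only inside a Slurm job, where SLURM_JOBID is defined.
if [ -n "${SLURM_JOBID:-}" ]; then
    # Prefer JOB_SCRATCH_DIR; fall back to the /scratch/$SLURM_JOBID path.
    SCRATCH="${JOB_SCRATCH_DIR:-/scratch/$SLURM_JOBID}"
    # Hypothetical locations under your personal research storage link.
    DATA="$HOME/${GROUP}_user/my_dataset"
    RESULTS="$HOME/${GROUP}_user/results/$SLURM_JOBID"

    stage_in "$DATA" "$SCRATCH" || exit 1
    mkdir -p "$SCRATCH/output"
    # Placeholder application that reads from and writes to scratch.
    ./my_app --in "$SCRATCH/input" --out "$SCRATCH/output"
    # Copy back whatever was produced, even if the application failed.
    stage_out "$SCRATCH" "$RESULTS"
fi
```

Keeping the copy-back step inside the job script matters because the scratch directory no longer exists once the job has finished.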

How Much Space Is Left?

You only have a limited amount of disk space, and in some cases, you will be sharing disk space with other users. Please be aware of what you are using.

Here are the areas you should be aware of:

  • HPCF Research Storage: You can check the available space with df -h /umbc/xfs1/$GROUP
  • ada Research Storage: Just like HPCF Research Storage, use df -h /nfs/ada/$GROUP
  • Home directory: There is limited space available in your home directory. Use the command du -s ~/ to check the amount of space you are using (du -sh ~/ prints the total in human-readable units). By default, you will only be allocated 500 MB.
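The checks listed above can be combined into a small script. The sketch below is only an illustration: the function names dir_usage_mb and check_quota are hypothetical, the 90% warning threshold is an arbitrary choice, and the commented calls assume that $GROUP is set as described at the top of this page.

```shell
# Print the size of a directory in whole megabytes (1 MB = 1024 KB here).
dir_usage_mb() {
    kb="$(du -sk "$1" | awk '{print $1}')"
    echo $(( kb / 1024 ))
}

# Report whether usage has reached a given percentage of a limit (both in MB).
check_quota() {  # usage: check_quota USED_MB LIMIT_MB WARN_PERCENT
    if [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]; then
        echo "WARNING: ${1} MB used of ${2} MB"
    else
        echo "OK: ${1} MB used of ${2} MB"
    fi
}

# Example calls on the cluster, using the documented 500 MB home limit:
#   check_quota "$(dir_usage_mb "$HOME")" 500 90
#   df -h "/umbc/xfs1/$GROUP" "/nfs/ada/$GROUP"
```

For instance, check_quota 460 500 90 prints "WARNING: 460 MB used of 500 MB", because 460 MB is above 90% of the 500 MB limit.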

[1]: https://linux.die.net/man/5/nfs (accessed 2021-07-21)