
System Description


The taki Cluster

The taki cluster is a heterogeneous cluster with equipment acquired in 2009, 2013, and 2018. It contains over 170 CPU nodes, a GPU cluster with a node containing 4 NVIDIA Tesla V100 GPUs, and an 8-node Big Data cluster. Equipment from 2003, 2007, 2008, and 2010 has been retired. The equipment currently in service in taki can be summarized as follows:

  • HPCF2018:
    • 42 compute and 2 develop nodes, each with two 18-core Intel Skylake CPUs
      • Information about the Intel Skylake CPUs can be found here
    • 1 GPU node with 4 NVIDIA Tesla V100 GPUs connected by NVLink
    • 8 Big Data nodes, each with two Skylake CPUs and 48 TB of disk space
    • 2 login/user nodes, 1 management node
  • HPCF2013:
    • 50 compute and 2 develop nodes, each containing two 8-core Intel Ivy Bridge CPUs
      • Information about the Intel Ivy Bridge CPUs can be found here
    • 19 hybrid CPU/GPU nodes, each with 2 CPUs and 2 GPUs
  • HPCF2009:
    • 82 compute and 2 develop nodes, each with two 4-core Intel Nehalem CPUs
      • Information about the Intel Nehalem CPUs can be found here

All nodes run the CentOS 7 version of the Linux operating system. Only the bash shell is supported. A schematic of the layout of taki is given below. The system is composed of two major networks, labeled IB-EDR and IB-QDR. For more information about these networks, see the Network Hardware section below. Notice that the HPCF2013 and HPCF2009 nodes are both on the IB-QDR network, while the HPCF2018 nodes are on IB-EDR.

[Schematic of the taki cluster layout and rack configuration]

 

Some photos of the cluster follow here.

[Photos: HPCF2018 compute nodes in racks A4, A5, and A6 (front and back); HPCF2013 racks (front, back, and a close-up of the nodes); HPCF2009 racks with doors opened and the IB-QDR InfiniBand switch (front and back).]

CPU, GPU, and Big Data Clusters in taki

For access, taki is divided into three clusters of distinct types, with components from several purchase dates. This structure is reflected in the tabs at the top of this webpage.

  • CPU Cluster: The 179 nodes total over 3000 cores and more than 20 TB of pooled memory!
    • HPCF2018: 42 compute and 2 develop nodes, each with two 18-core Intel Skylake CPUs and 384 GB of memory,
    • HPCF2013: 49 compute and 2 develop nodes, each with two 8-core Intel Ivy Bridge CPUs and 64 GB of memory,
    • HPCF2009: 82 compute and 2 develop nodes, each with two 4-core Intel Nehalem CPUs and 24 GB of memory;
  • GPU Cluster:
    • HPCF2018: 1 GPU node containing 4 NVIDIA Tesla V100 GPUs connected by NVLink and 2 Intel Skylake CPUs,
    • HPCF2013: 18 CPU/GPU nodes, each a hybrid node with 2 CPUs and 2 NVIDIA K20 GPUs;
  • Big Data Cluster:
    • HPCF2018: 8 Big Data nodes, each with 2 CPUs and 48 TB disk space;
  • Other Nodes:
    • 2 login/user nodes (taki-usr1, taki-usr2),
    • 1 management node.

Each of the three clusters is documented in more detail in the following three lists.

CPU Cluster:

  • HPCF2018: The 42 compute nodes total 1512 cores and over 15 TB of pooled memory! (The totals quoted for each year are worked out after this list.)
    • 42 compute nodes (cnode002-029,031-044) and 2 develop nodes (cnode001,cnode030), each with two 18-core Intel Xeon Gold 6140 Skylake CPUs (2.3 GHz clock speed, 24.75 MB L3 cache, 6 memory channels, 140 W power), for a total of 36 cores per node,
    • Each node has 384 GB of memory (12 x 32 GB DDR4 at 2666 MT/s) and a 120 GB SSD disk,
    • The nodes are connected by a network of four 36-port EDR (Enhanced Data Rate) InfiniBand switches (100 Gb/s bandwidth, 90 ns latency);
  • HPCF2013: The 49 compute nodes total 784 cores and over 3 TB of pooled memory.
    • 49 compute nodes and 2 develop nodes, each with two 8-core Intel E5-2650v2 Ivy Bridge CPUs (2.6 GHz clock speed, 20 MB L3 cache, 4 memory channels), for a total of 16 cores per node,
    • Each node has 64 GB of memory (8 x 8 GB DDR3) and 500 GB of local hard drive,
    • The nodes are connected by a QDR (quad-data rate) InfiniBand switch;
  • HPCF2009: The 82 compute nodes total 656 cores and nearly 2 TB of pooled memory.
    • 82 compute nodes and 2 develop nodes, each with two 4-core Intel Nehalem X5550 CPUs (2.6 GHz clock speed, 8 MB L3 cache, 3 memory channels), for a total of 8 cores per node,
    • Each node has 24 GB of memory (6 x 4 GB DDR3) and a 120 GB local hard drive,
    • The nodes are connected by a QDR (quad-data rate) InfiniBand switch.
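
For reference, the core and memory totals quoted above follow directly from the per-node figures:

    HPCF2018: 42 nodes x 36 cores = 1512 cores;  42 x 384 GB = 16,128 GB (about 15.75 TB)
    HPCF2013: 49 nodes x 16 cores =  784 cores;  49 x  64 GB =  3,136 GB (just over 3 TB)
    HPCF2009: 82 nodes x  8 cores =  656 cores;  82 x  24 GB =  1,968 GB (nearly 2 TB)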

GPU Cluster:

  • HPCF2018:
    • 1 GPU node containing four NVIDIA Tesla V100 GPUs (5120 CUDA cores each) connected by NVLink and two 18-core Intel Skylake CPUs,
    • The node has 384 GB of memory (12 x 32 GB DDR4 at 2666 MT/s) and a 120 GB SSD disk;
  • HPCF2013:
    • 18 hybrid CPU/GPU nodes, each with two NVIDIA K20 GPUs (2496 computational cores, 5 GB onboard memory) and two 8-core Intel E5-2650v2 Ivy Bridge CPUs (2.6 GHz clock speed, 20 MB L3 cache, 4 memory channels),
    • Each node has 64 GB of memory (8 x 8 GB DDR3) and 500 GB of local hard drive,
    • The nodes are connected by a QDR (quad-data rate) InfiniBand switch.

Big Data Cluster:

  • HPCF2018:
    • 8 Big Data nodes, each with two 18-core Intel Xeon Gold 6140 Skylake CPUs (2.3 GHz clock speed, 24.75 MB L3 cache, 6 memory channels, 140 W power), for a total of 36 cores per node,
    • Each node has 384 GB of memory (12 x 32 GB DDR4 at 2666 MT/s) and 48 TB (12 x 4 TB) SATA hard disks,
    • The nodes are connected by a 10 Gb/s Ethernet network.

The following schematics show the architecture of the CPUs and GPUs for the HPCF2013 nodes. The first schematic shows one of the compute nodes, which consists of two eight-core 2.6 GHz Intel E5-2650v2 Ivy Bridge CPUs. Each core of each CPU has a dedicated 32 kB of L1 cache and 256 kB of L2 cache. All cores of each CPU share 20 MB of L3 cache. The node's 64 GB of memory is the combination of eight 8 GB DIMMs, four of which are connected to each CPU. The two CPUs of a node are connected to each other by two QPI (quick path interconnect) links. Nodes are connected by a quad-data rate InfiniBand interconnect.

The NVIDIA K20 is a powerful general-purpose graphics processing unit (GPGPU) with 2496 computational cores, designed for efficient double-precision calculation. GPU-accelerated computing has become popular in recent years because a GPU can achieve higher performance than a general-purpose CPU on computationally intensive portions of code. The NVIDIA K20 GPU has 5 GB of onboard memory.

Types of Nodes

The taki cluster contains several types of nodes that fall into four main categories for usage.

  • Management node – There is one management node, which is reserved for administration of the cluster. It is not available to users.
  • User nodes – Users work on these nodes directly. This is where users log in, access files, write and compile code, and submit jobs to the scheduler to be run on the compute nodes. Furthermore, these are the only nodes that may be accessed via SSH/SCP from outside of the cluster.
  • Compute nodes – These nodes are where the majority of computing on the cluster takes place. Users normally do not interact with these nodes directly. Instead, jobs are created and submitted from the user nodes, and a program called the scheduler decides which compute resources are available to run the job. In principle, a user could connect to compute nodes directly from a user node by SSH (e.g. “ssh n84”). However, SSH access to the compute nodes is disabled to help maintain stability of the cluster. Compute nodes should instead be allocated by the scheduler; a minimal example of this workflow is sketched after this list. Advanced users should be very careful when doing things such as spawning child processes or threads – you must ensure that you are only using the resources allocated to you by the scheduler. It is best to contact HPCF user support if there are questions about how to set up a non-standard job on taki.
  • Development nodes – These are special compute nodes which are dedicated to running code that is under development. This allows users to test their programs without interfering with programs running in production. Programs are also limited to a short maximum run time on these nodes to make sure users do not need to wait too long before a program will run. Development jobs are expected to be small in scale but rerun frequently as you work on your code. The availability of two development nodes allows you to try several useful configurations: single core, several cores on one processor, several cores on multiple processors of one machine, all cores on multiple machines, etc.
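
To make this workflow concrete, the following is a minimal sketch of logging in to a user node and submitting a job to the Slurm scheduler. The hostname taki.rs.umbc.edu, the script name hello.slurm, and the specific #SBATCH settings are illustrative assumptions, not prescribed values; consult HPCF user support for the settings recommended for your work.

    # 1. Log in to a user node from outside the cluster (hostname assumed for illustration)
    ssh username@taki.rs.umbc.edu

    # 2. Create a minimal Slurm batch script, e.g. hello.slurm, containing:
    #      #!/bin/bash
    #      #SBATCH --job-name=hello
    #      #SBATCH --output=hello.out
    #      #SBATCH --ntasks=1
    #      #SBATCH --time=00:05:00
    #      echo "Running on $(hostname)"

    # 3. Submit the job from the user node; the scheduler chooses the compute node
    sbatch hello.slurm

    # 4. Check the status of your jobs
    squeue -u $USER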

The taki cluster features several different computing environments. We ask that users run serial, CPU-only jobs on the HPCF2009 and HPCF2013 nodes, since their core-by-core performance is equivalent to that of HPCF2018. Only properly parallelized code can take full advantage of the 36-core nodes in HPCF2018. The scheduler is set up to help enforce this automatically.
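
As a sketch of how the resource requests differ, the #SBATCH lines below contrast a serial job (a single core, appropriate for the HPCF2009 and HPCF2013 nodes) with a fully parallel job that uses all 36 cores on each of two HPCF2018 nodes. Any partition or constraint names used to select a particular hardware generation are site-specific and intentionally omitted here; the scheduler configuration and HPCF user support are the authoritative sources for those.

    # Serial, CPU-only job: request a single core
    #SBATCH --ntasks=1

    # Parallel (e.g. MPI) job on HPCF2018: two full nodes with 36 tasks per node
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=36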

 

Storage

There are a few special storage systems attached to the clusters, in addition to the standard Unix filesystem. Here we describe the areas which are relevant to users.

Home directory
Each user has a home directory on the /home partition. This partition is 200 GB, and its data is backed up by DoIT. Since the partition is fairly small, users can only store 300 MB of data in their home directory.
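
To see how much of this quota you are currently using, a standard command such as the following can be run from a user node (du is the ordinary Linux disk-usage tool; nothing here is specific to taki):

    # Report the total size of your home directory (compare against the 300 MB limit)
    du -sh ~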

Scratch Space
All compute nodes have a local /scratch directory, generally about 100 GB in size. For each job, Slurm creates a folder in this directory and sets the environment variable JOB_SCRATCH_DIR to its path for the duration of the job. Users have access to this folder only for the duration of their job; after the job completes, the space is purged. The space is shared between all users, but your data is accessible only to you. It is safer to use for jobs than the usual /tmp directory, since critical system processes also require use of /tmp.
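
As an illustration, a batch script might do its heavy file I/O in this per-job directory and copy the results back before exiting, since the directory is purged when the job ends. This is only a sketch: JOB_SCRATCH_DIR is the variable described above, SLURM_SUBMIT_DIR is the standard Slurm variable for the directory the job was submitted from, and my_program stands in for your own executable.

    #!/bin/bash
    #SBATCH --job-name=scratch_demo
    #SBATCH --output=scratch_demo.out

    # Work inside the per-job scratch directory provided by Slurm
    cd "$JOB_SCRATCH_DIR"

    # Run the application and write intermediate/output files locally
    "$SLURM_SUBMIT_DIR"/my_program > results.txt

    # Copy results back before the job ends, because /scratch is purged afterwards
    cp results.txt "$SLURM_SUBMIT_DIR"/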

Tmp Space
All nodes (including the user nodes) have a local /tmp directory, generally about 40 GB in size. Any files in a machine's /tmp directory are only visible on that machine, and the system deletes them periodically. Furthermore, the space is shared between all users. It is preferred that users make use of scratch space over /tmp whenever possible.

Research Storage
Users are also given space on research storage, which can be accessed from anywhere on the cluster. Space is available for both individual users and groups, and it is accessible via symbolic links in users' home directories. Note that this area is not backed up.

UMBC AFS Storage Access
Your AFS partition is the directory where your personal files are stored when you use the DoIT computer labs or the gl.umbc.edu login nodes. The UMBC-wide /afs can be accessed from the taki login nodes.