- Checking the top command
- Checking the proc filesystem
- Checking memory from a serial C program
- Checking memory from a parallel C program
In just about any computing activity, it’s important to ensure that your programs are using memory efficiently. This is especially crucial in high performance computing, where your problem may be so large that it won’t fit on a single machine, or even a few machines. On this page, we’ll have a look at how to monitor memory usage on the cluster.
Checking the top command
The easiest way to check the memory usage of a running process is to use the interactive “top” command. At the command line, try running
[araim1@maya-usr1 ~]$ top
You’ll probably get a long list of processes like the one below, most of which you aren’t interested in. You’ll also see some useful summary numbers, such as free memory, swap space used, and the percentage of CPU currently utilized. Each process has several memory statistics. The largest (most conservative) figure is VIRT, the total virtual memory of the process, which includes its code, data, and any memory it has mapped but not necessarily touched. The figure that best reflects actual usage is RES, the resident set size, which counts only the physical memory the process currently occupies. Together these two values give us a good idea of our usage. The top display automatically updates itself every few seconds. For more information, see the top manual page (“man top”).
top - 01:19:53 up 79 days, 12:52,  4 users,  load average: 0.00, 0.00, 0.00
Tasks: 232 total,   1 running, 230 sleeping,   0 stopped,   1 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 99.5%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  49433548k total, 35805616k used, 13627932k free,   747232k buffers
Swap:  8385888k total,        0k used,  8385888k free, 32890968k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28336 araim1    15   0 10976 1148  776 R  0.3  0.0   0:00.02 top
    1 root      15   0 10348  696  584 S  0.0  0.0   0:01.45 init
    2 root      RT  -5     0    0    0 S  0.0  0.0   0:01.91 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
    5 root      RT  -5     0    0    0 S  0.0  0.0   0:01.39 migration/1
    6 root      34  19     0    0    0 S  0.0  0.0   0:00.05 ksoftirqd/1
    7 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/1
    8 root      RT  -5     0    0    0 S  0.0  0.0   0:01.04 migration/2
    9 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/2
   10 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/2
   11 root      RT  -5     0    0    0 S  0.0  0.0   0:01.66 migration/3
   12 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/3
   13 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/3
   14 root      RT  -5     0    0    0 S  0.0  0.0   0:07.36 migration/4
   15 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/4
   16 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/4
   17 root      RT  -5     0    0    0 S  0.0  0.0   0:00.50 migration/5
   18 root      34  19     0    0    0 S  0.0  0.0   0:00.15 ksoftirqd/5
   19 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/5
   20 root      RT  -5     0    0    0 S  0.0  0.0   0:00.12 migration/6
   21 root      34  19     0    0    0 S  0.0  0.0   0:00.01 ksoftirqd/6
We can narrow the list down to just our processes. Type “u”, then your username, then enter. You’ll get a shorter list that looks something like this:
top - 01:30:57 up 79 days, 13:03,  4 users,  load average: 0.00, 0.00, 0.00
Tasks: 232 total,   1 running, 230 sleeping,   0 stopped,   1 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  49433548k total, 35805656k used, 13627892k free,   747232k buffers
Swap:  8385888k total,        0k used,  8385888k free, 32891308k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28336 araim1    15   0 10976 1152  780 R  0.3  0.0   0:01.47 top
25694 araim1    15   0 99196 1768  976 S  0.0  0.0   0:02.60 sshd
25695 araim1    15   0 66260 3672 1184 S  0.0  0.0   0:01.67 bash
One more useful thing we’ll mention here: typing “1” toggles the display between a single combined CPU line and separate statistics for each individual core, as shown below.
top - 01:32:09 up 79 days, 13:05,  4 users,  load average: 0.00, 0.00, 0.00
Tasks: 232 total,   1 running, 230 sleeping,   0 stopped,   1 zombie
Cpu0  :  0.0%us,  0.1%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.1%us,  0.1%sy,  0.0%ni, 99.5%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.1%us,  0.1%sy,  0.0%ni, 99.5%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.2%us,  0.1%sy,  0.0%ni, 99.5%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.1%us,  0.3%sy,  0.0%ni, 99.6%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.1%us,  0.0%sy,  0.0%ni, 99.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  49433548k total, 35805832k used, 13627716k free,   747232k buffers
Swap:  8385888k total,        0k used,  8385888k free, 32891340k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
25694 araim1    15   0 99196 1768  976 S  0.0  0.0   0:02.60 sshd
25695 araim1    15   0 66260 3672 1184 S  0.0  0.0   0:01.67 bash
28336 araim1    15   0 10976 1152  780 R  0.0  0.0   0:01.61 top
The issue with top is that it’s interactive. When running our high performance parallel code, we may want to log memory usage at specific times, for example right after allocating a large data structure, and we’d prefer not to have to watch the top display and track things manually.
Checking the proc filesystem
Let’s take one step toward automating memory checking. To do this, we’ll use the proc filesystem, a special filesystem on Unix-like machines that exposes information about the system and its processes. We’ll try a few commands to get a feel for it. Here is information about the CPU cores on the front end node. This is fairly static information that we do not expect to change much.
[araim1@maya-usr1 ~]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 26
model name      : Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping        : 5
cpu MHz         : 1596.000
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5333.69
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: [8]

... (7 other cores are displayed as well) ...
We can also see system-wide memory information, such as how much memory is free and how much swap space is being used. This information is more dynamic and changes constantly.
[araim1@maya-usr1 ~]$ cat /proc/meminfo
MemTotal:     49433548 kB
MemFree:      13626952 kB
Buffers:        747232 kB
Cached:       32891628 kB
SwapCached:          0 kB
Active:        4659696 kB
Inactive:     29089212 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     49433548 kB
LowFree:      13626952 kB
SwapTotal:     8385888 kB
SwapFree:      8385888 kB
Dirty:             136 kB
Writeback:           0 kB
AnonPages:      109404 kB
Mapped:          18788 kB
Slab:          1922440 kB
PageTables:      11608 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  33102660 kB
Committed_AS:   305584 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    270516 kB
VmallocChunk: 34359467755 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
We can also check the memory usage of a specific process. Try “cat /proc/<PID>/status” to get information about a process with a given PID (process ID). Running “cat /proc/self/status” gives information about the current process.
[araim1@maya-usr1 check_memory_parallel]$ cat /proc/self/status
Name:   cat
State:  R (running)
SleepAVG:       88%
Tgid:   28665
Pid:    28665
PPid:   25695
TracerPid:      0
Uid:    28398   28398   28398   28398
Gid:    1057    1057    1057    1057
FDSize: 256
Groups: 700 701 1057 32296 1104637136
VmPeak:    58904 kB
VmSize:    58904 kB
VmLck:         0 kB
VmHWM:       476 kB
VmRSS:       476 kB
VmData:      164 kB
VmStk:        84 kB
VmExe:        20 kB
VmLib:      1444 kB
VmPTE:        40 kB
StaBrk: 04eed000 kB
Brk:    04f0e000 kB
StaStk: 7fff9cdf3b70 kB
Threads:        1
SigQ:   0/409600
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
Cpus_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000ffff
Mems_allowed:   00000000,00000003
Notice that we’re getting information about the “cat” command, which is the “self” when we run “cat /proc/self/status” directly from the command line. Earlier when we ran the top command, we looked at the VIRT and RES columns. From the display above, we can get the same information from the VmSize and VmRSS fields, respectively.
[araim1@maya-usr1 ~]$ cat /proc/self/status | egrep 'VmSize|VmRSS'
VmSize:    58908 kB
VmRSS:       468 kB
Next we will show how to gather this information from a C function.
Checking memory from a serial C program
The following function reads the file “/proc/self/status”, and parses out the numbers in the VmSize and VmRSS fields. Now “self” will refer to the C program that’s invoking this function.
#include "memory.h" /* * Look for lines in the procfile contents like: * VmRSS: 5560 kB * VmSize: 5560 kB * * Grab the number between the whitespace and the "kB" * If 1 is returned in the end, there was a serious problem * (we could not find one of the memory usages) */ int get_memory_usage_kb(long* vmrss_kb, long* vmsize_kb) { /* Get the the current process' status file from the proc filesystem */ FILE* procfile = fopen("/proc/self/status", "r"); long to_read = 8192; char buffer[to_read]; int read = fread(buffer, sizeof(char), to_read, procfile); fclose(procfile); short found_vmrss = 0; short found_vmsize = 0; char* search_result; /* Look through proc status contents line by line */ char delims[] = "\n"; char* line = strtok(buffer, delims); while (line != NULL && (found_vmrss == 0 || found_vmsize == 0) ) { search_result = strstr(line, "VmRSS:"); if (search_result != NULL) { sscanf(line, "%*s %ld", vmrss_kb); found_vmrss = 1; } search_result = strstr(line, "VmSize:"); if (search_result != NULL) { sscanf(line, "%*s %ld", vmsize_kb); found_vmsize = 1; } line = strtok(NULL, delims); } return (found_vmrss == 1 && found_vmsize == 1) ? 0 : 1; }
Download: ../code/check_memory_serial/memory.c
Here’s the corresponding header file
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int get_memory_usage_kb(long* vmrss_kb, long* vmsize_kb);
Download: ../code/check_memory_serial/memory.h
Here’s a small program to test our function. It allocates 20 large buffers, reporting memory usage each time.
#include "memory.h" int main() { int n = 20; int entrySize = 10000000; int* buffer[n]; long vmrss, vmsize; for (int i = 0; i < n; i++) { buffer[i] = malloc( entrySize * sizeof(int) ); if (!buffer[i]) { printf("Couldn't allocate memory!\n"); exit(1); } for (int j = 0; j < entrySize; j++) { buffer[i][j] = 0; } get_memory_usage_kb(&vmrss, &vmsize); printf("%2d: Current memory usage: VmRSS = %6ld KB, VmSize = %6ld KB\n", i, vmrss, vmsize); } return 0; }
Finally, here is a simple Makefile to compile the test program
PROGNAME := check_memory

main: memory.h
	mpicc $(PROGNAME).c memory.c -o $(PROGNAME)

clean:
	rm -f $(PROGNAME) *.o
Download: ../code/check_memory_serial/Makefile
Building and running the code produces output like this
[araim1@maya-usr1 check_memory_serial]$ make
mpicc check_memory.c memory.c -o check_memory
check_memory.c:
memory.c:
[araim1@maya-usr1 check_memory_serial]$ ./check_memory
 0: Current memory usage: VmRSS =  40260 KB, VmSize =  68052 KB
 1: Current memory usage: VmRSS =  79432 KB, VmSize = 107120 KB
 2: Current memory usage: VmRSS = 118492 KB, VmSize = 146180 KB
 3: Current memory usage: VmRSS = 157556 KB, VmSize = 185244 KB
 4: Current memory usage: VmRSS = 196620 KB, VmSize = 224304 KB
 5: Current memory usage: VmRSS = 235680 KB, VmSize = 263368 KB
 6: Current memory usage: VmRSS = 274744 KB, VmSize = 302432 KB
 7: Current memory usage: VmRSS = 313804 KB, VmSize = 341492 KB
 8: Current memory usage: VmRSS = 352868 KB, VmSize = 380556 KB
 9: Current memory usage: VmRSS = 391932 KB, VmSize = 419620 KB
10: Current memory usage: VmRSS = 430992 KB, VmSize = 458680 KB
11: Current memory usage: VmRSS = 470056 KB, VmSize = 497744 KB
12: Current memory usage: VmRSS = 509120 KB, VmSize = 536804 KB
13: Current memory usage: VmRSS = 548180 KB, VmSize = 575868 KB
14: Current memory usage: VmRSS = 587244 KB, VmSize = 614932 KB
15: Current memory usage: VmRSS = 626304 KB, VmSize = 653992 KB
16: Current memory usage: VmRSS = 665368 KB, VmSize = 693056 KB
17: Current memory usage: VmRSS = 704432 KB, VmSize = 732120 KB
18: Current memory usage: VmRSS = 743492 KB, VmSize = 771180 KB
19: Current memory usage: VmRSS = 782556 KB, VmSize = 810244 KB
[araim1@maya-usr1 check_memory_serial]$
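As a quick sanity check, each iteration allocates entrySize × sizeof(int) = 10,000,000 × 4 bytes ≈ 39,062 KB (assuming 4-byte ints, as on this system), which agrees with the increase of roughly 39,060 KB per iteration seen in both VmRSS and VmSize above.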
Checking memory from a parallel C program
We’ve seen how to check memory usage for a single process, but what about an MPI job with multiple processes? Let’s suppose we want to see the usage for each process, as well as the total (sum) across all processes. We’ll use the serial function from the previous section and gather the results into an array on a single process (with rank “root”). We’ll also write a simple helper function that sums over this array (with the result stored on process 0).
#include "memory_parallel.h" int get_cluster_memory_usage_kb(long* vmrss_per_process, long* vmsize_per_process, int root, int np) { long vmrss_kb; long vmsize_kb; int ret_code = get_memory_usage_kb(&vmrss_kb, &vmsize_kb); if (ret_code != 0) { printf("Could not gather memory usage!\n"); return ret_code; } MPI_Gather(&vmrss_kb, 1, MPI_UNSIGNED_LONG, vmrss_per_process, 1, MPI_UNSIGNED_LONG, root, MPI_COMM_WORLD); MPI_Gather(&vmsize_kb, 1, MPI_UNSIGNED_LONG, vmsize_per_process, 1, MPI_UNSIGNED_LONG, root, MPI_COMM_WORLD); return 0; } int get_global_memory_usage_kb(long* global_vmrss, long* global_vmsize, int np) { long vmrss_per_process[np]; long vmsize_per_process[np]; int ret_code = get_cluster_memory_usage_kb(vmrss_per_process, vmsize_per_process, 0, np); if (ret_code != 0) { return ret_code; } *global_vmrss = 0; *global_vmsize = 0; for (int i = 0; i < np; i++) { *global_vmrss += vmrss_per_process[i]; *global_vmsize += vmsize_per_process[i]; } return 0; }
Here’s the corresponding header file
#include <mpi.h>
#include "memory.h"

int get_cluster_memory_usage_kb(long* vmrss_per_process, long* vmsize_per_process, int root, int np);
int get_global_memory_usage_kb(long* global_vmrss, long* global_vmsize, int np);
Here is a program to test our functions. This time, for simplicity, we allocate only one vector on each process. Then we print out the memory information.
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>
#include "memory_parallel.h"

int main(int argc, char *argv[])
{
    int id, np;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    char hostname[MPI_MAX_PROCESSOR_NAME];
    int processor_name_len;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Get_processor_name(processor_name, &processor_name_len);

    printf("Number_of_processes=%03d, My_rank=%03d, processor_name=%5s\n",
        np, id, processor_name);

    int entrySize = 1000000 + id * 100000;
    long* l_buffer[entrySize];

    for (int j = 0; j < entrySize; j++)
    {
        l_buffer[j] = 0;
    }

    long vmrss_per_process[np];
    long vmsize_per_process[np];
    get_cluster_memory_usage_kb(vmrss_per_process, vmsize_per_process, 0, np);

    if (id == 0)
    {
        for (int k = 0; k < np; k++)
        {
            printf("Process %03d: VmRSS = %6ld KB, VmSize = %6ld KB\n",
                k, vmrss_per_process[k], vmsize_per_process[k]);
        }
    }

    long global_vmrss, global_vmsize;
    get_global_memory_usage_kb(&global_vmrss, &global_vmsize, np);

    if (id == 0)
    {
        printf("Global memory usage: VmRSS = %6ld KB, VmSize = %6ld KB\n",
            global_vmrss, global_vmsize);
    }

    MPI_Finalize();
    return 0;
}
Here is the Makefile
PROGNAME := check_memory

main: memory.h memory_parallel.h
	mpicc $(PROGNAME).c memory.c memory_parallel.c -o $(PROGNAME)

clean:
	rm -f $(PROGNAME) *.o
Download: ../code/check_memory_parallel/Makefile
And here is a simple SLURM script to run the program
#!/bin/bash
#SBATCH --job-name=MPI_check_memory
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2

srun ./check_memory
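With 2 nodes and 2 tasks per node, this script launches 2 × 2 = 4 MPI processes, which matches the process count shown in the output below.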
Make sure to obtain memory.c and memory.h from the previous section as well. Building and running the code yields the following result.
[araim1@maya-usr1 check_memory_parallel]$ make
mpicc check_memory.c memory.c memory_parallel.c -o check_memory
check_memory.c:
memory.c:
memory_parallel.c:
[araim1@maya-usr1 check_memory_parallel]$ sbatch mvapich2.slurm
sbatch: Submitted batch job 1581
[araim1@maya-usr1 check_memory_parallel]$ cat slurm.out
Number_of_processes=004, My_rank=003, processor_name= n2
Number_of_processes=004, My_rank=001, processor_name= n1
Number_of_processes=004, My_rank=000, processor_name= n1
Process 000: VmRSS =  22996 KB, VmSize =  77268 KB
Process 001: VmRSS =  19624 KB, VmSize =  78048 KB
Process 002: VmRSS =  24552 KB, VmSize =  78828 KB
Process 003: VmRSS =  21196 KB, VmSize =  79612 KB
Global memory usage: VmRSS =  88752 KB, VmSize = 314052 KB
Number_of_processes=004, My_rank=002, processor_name= n2
[araim1@maya-usr1 check_memory_parallel]$
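As in the serial case, we can check the numbers against the program: on this 64-bit system each entry of l_buffer occupies 8 bytes, so each successive rank allocates an extra 100,000 × 8 bytes ≈ 781 KB. This matches the step of roughly 780 KB in VmSize from one rank to the next. VmRSS varies more from rank to rank, since resident memory also depends on how much of each process’ address space has actually been touched.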
Notice that the global memory usage reported in the output is slightly higher than the sum of the per-process usages reported immediately before it. The difference comes from the MPI library itself, which allocates some memory as we use it, here between the first and second rounds of measurements. If we called both memory functions a second time in our test program, we would see that the numbers have stabilized.
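As a minimal sketch of that check (not part of the original program), one could add the following lines to the end of main() in the test program above, just before MPI_Finalize(), reusing the id and np variables already defined there:

    /* Measure global usage twice in a row; by this point the MPI library
     * has allocated what it needs, so the two results should agree, or
     * very nearly so. Only the values gathered on rank 0 are meaningful. */
    long vmrss_a, vmsize_a, vmrss_b, vmsize_b;
    get_global_memory_usage_kb(&vmrss_a, &vmsize_a, np);
    get_global_memory_usage_kb(&vmrss_b, &vmsize_b, np);

    if (id == 0)
    {
        printf("First  check: VmRSS = %6ld KB, VmSize = %6ld KB\n", vmrss_a, vmsize_a);
        printf("Second check: VmRSS = %6ld KB, VmSize = %6ld KB\n", vmrss_b, vmsize_b);
    }

Since every rank calls get_global_memory_usage_kb, the collective MPI_Gather calls inside it complete as expected, and only rank 0 prints the results.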