Checking which CPUs are used by your program

In the how to run tutorial, we discussed how to select the number of nodes and processes per node for a job. It can also be useful to find out which processor cores a job actually uses, although this is harder to control.

Why would this be of interest? For example, an HPCF2013 node has two physical CPUs with eight cores each. A memory-intensive job with two processes may perform very differently when the processes share a physical CPU than when each is allocated its own. On this page, we discuss ways of reporting the identity of the host CPU for our processes.

Note that the discussion on this page is specific to Linux, and details may differ on systems other than ours. We make use of the proc filesystem, which contains special files describing static information, such as the hardware, as well as dynamic information, such as the currently running processes.

Which CPUs are available?

The special “/proc/cpuinfo” file can be queried to get information about which CPUs are available on the current node. We do not expect this information to change over time on a given node. The following output is obtained on maya-usr1.

[araim1@maya-usr1 ~]$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping	: 4
cpu MHz		: 2599.948
cache size	: 20480 KB
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 8
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr
pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx
f16c rdrand lahf_lm ida arat xsaveopt pln pts dts tpr_shadow vnmi flexpriority
ept vpid fsgsbase smep erms
bogomips	: 5199.89
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:

... 15 other cores are displayed in the same format ...
[araim1@maya-usr1 ~]$

We can see that there are 16 distinct “processors” with processor IDs 0-15. For each processor, the field “physical id” is either 0 or 1, corresponding to one of two physical CPUs. Notice that here, processors with even IDs reside on the CPU with physical ID 0, while processors with odd IDs reside on the CPU with physical ID 1. Other interesting information is available as well, such as the model name, clock speed, and cache size.

Checking processor ID for the current process

For a running process, we would like to know which processor ID is currently in use. This can vary over the lifetime of a program, but the last known processor ID can be obtained by querying another special file called “/proc/self/stat”. Note that the directory “/proc/self” contains special files with information about the current process; instead of “self” we could also provide a process ID (pid) for another running program.

[araim1@maya-usr1 ~]$ cat /proc/self/stat
48982 (cat) R 9744 48982 9744 34818 48982 8192 185 0 0 0 0 0 0 0 20 0 1 0
137436614 103354368 134 18446744073709551615 4194304 4235780 140737488346592
140737488343784 252896458544 0 0 0 0 0 0 0 17 2 0 0 0 0 0
[araim1@maya-usr1 ~]$

This is difficult to read because the fields are not labelled, but the 6th-to-last field “2” represents the processor ID. If we count from the beginning, we see that this is the 39th field. We can grab this field programmatically using the “gawk” command.

[araim1@maya-usr1 ~]$ CPU_ID=$(cat /proc/self/stat)
[araim1@maya-usr1 ~]$ echo $CPU_ID | gawk '{print $39}'
2
[araim1@maya-usr1 ~]$

Next we will show how to obtain this information from a C function, so that a running program can report it directly.

Checking processor ID from a C program

The following function reads the file “/proc/self/stat”, and parses the processor ID as we have done above.

#include "getcpuid.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/*
* This code is adapted from an example at:
* http://brokestream.com/procstat.html
*/

int get_cpu_id()
{
    /* Read the current process' stat file from the proc filesystem */
    FILE* procfile = fopen("/proc/self/stat", "r");
    if (procfile == NULL)
    {
        return -1;
    }
    long to_read = 8192;
    char buffer[to_read];
    int read = fread(buffer, sizeof(char), to_read - 1, procfile);
    fclose(procfile);
    buffer[read] = '\0';  /* fread does not null-terminate the buffer */

    /* Field with index 38 (zero-based counting) is the one we want */
    char* line = strtok(buffer, " ");
    for (int i = 1; i < 38; i++)
    {
        line = strtok(NULL, " ");
    }

    line = strtok(NULL, " ");
    int cpu_id = atoi(line);
    return cpu_id;
}


Download: ../code/get-cpu-id/getcpuid.c

Here’s the corresponding header file

#ifndef GETCPUID_H
#define GETCPUID_H

int get_cpu_id();

#endif

Download: ../code/get-cpu-id/getcpuid.h

Here’s a small program to test our function. It’s more interesting to try this with a parallel program than a serial program, so let’s use MPI and have each process report its hostname and processor ID.

#include <mpi.h>
#include <stdio.h>
#include "getcpuid.h"

int main(int argc, char** argv)
{
    int id, np;
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int processor_name_len;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &id);
    MPI_Get_processor_name(processor_name, &processor_name_len);

    int cpu_id = get_cpu_id();
    printf("Hello from process %03d out of %03d, hostname %s, cpu_id %d\n", 
        id, np, processor_name, cpu_id);

    MPI_Finalize();
    return 0;
}


Download: ../code/get-cpu-id/main.c

Here is a simple Makefile to compile the test program

PROGNAME := get_cpu

main: main.c getcpuid.c getcpuid.h
    mpicc -std=c99 main.c getcpuid.c -o $(PROGNAME)

clean:
    rm -f $(PROGNAME) *.o


Download: ../code/get-cpu-id/Makefile

Let’s try the test program with the following batch script

#!/bin/bash
#SBATCH --job-name=getcpu_parallel
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8
#SBATCH --constraint=hpcf2013

srun ./get_cpu

Download: ../code/get-cpu-id/run.slurm

Building and running the code produces the following output.

[araim1@maya-usr1 get-cpu-id]$ make
mpicc -std=c99 main.c getcpuid.c -o get_cpu
[araim1@maya-usr1 get-cpu-id]$ sbatch run.slurm 
Submitted batch job 9216
[araim1@maya-usr1 get-cpu-id]$ cat slurm.out 
Hello from process 001 out of 016, hostname n68, cpu_id 2
Hello from process 002 out of 016, hostname n68, cpu_id 4
Hello from process 003 out of 016, hostname n68, cpu_id 6
Hello from process 004 out of 016, hostname n68, cpu_id 8
Hello from process 005 out of 016, hostname n68, cpu_id 10
Hello from process 006 out of 016, hostname n68, cpu_id 12
Hello from process 000 out of 016, hostname n68, cpu_id 0
Hello from process 009 out of 016, hostname n69, cpu_id 2
Hello from process 010 out of 016, hostname n69, cpu_id 4
Hello from process 011 out of 016, hostname n69, cpu_id 6
Hello from process 012 out of 016, hostname n69, cpu_id 8
Hello from process 007 out of 016, hostname n68, cpu_id 14
Hello from process 008 out of 016, hostname n69, cpu_id 0
Hello from process 013 out of 016, hostname n69, cpu_id 10
Hello from process 014 out of 016, hostname n69, cpu_id 12
Hello from process 015 out of 016, hostname n69, cpu_id 14
[araim1@maya-usr1 get-cpu-id]$ 

Notice that only the even-numbered processor IDs have been used, which indicates that all eight processes on each node were running on the CPU with physical ID 0.


Checking processor ID from a C program with OpenMP Multithreading

The following function obtains the processor ID for each process as we have done above, and also reports the processor ID for each thread using the sched_getcpu() function.

nodesused.c

Here is the Makefile to compile the test program
Makefile

run.slurm

Building and running the code produces the following output.

[khsa1@maya-usr1 cpu-id]$ make
[khsa1@maya-usr1 cpu-id]$ sbatch run.slurm
Submitted batch job 1710841
[khsa1@maya-usr1 cpu-id]$ cat nodesused.log
MPI process 0000 of 0002 on cpu_id 00 of node   n1
MPI process 0000 of 0002 thread 00 of 04 on cpu_id 06 of node   n1
MPI process 0000 of 0002 thread 01 of 04 on cpu_id 04 of node   n1
MPI process 0000 of 0002 thread 02 of 04 on cpu_id 02 of node   n1
MPI process 0000 of 0002 thread 03 of 04 on cpu_id 00 of node   n1
MPI process 0001 of 0002 on cpu_id 08 of node   n1
MPI process 0001 of 0002 thread 00 of 04 on cpu_id 14 of node   n1
MPI process 0001 of 0002 thread 01 of 04 on cpu_id 08 of node   n1
MPI process 0001 of 0002 thread 02 of 04 on cpu_id 10 of node   n1
MPI process 0001 of 0002 thread 03 of 04 on cpu_id 12 of node   n1

The code produces a separate log file for each MPI process and OpenMP thread. The last three lines of the Slurm script perform post-processing, sorting and concatenating the log files into a single nodesused.log and then deleting the extraneous per-thread log files.