Debugging an Active Process

To get the job state and the nodes your job is running on, use the command squeue -j <jobID>, where <jobID> is the ID number of the job you want to debug.

Example Using squeue
maya$  squeue -j 4583521
            JOBID PARTITION     NAME     USER    ST       TIME  NODES NODELIST(REASON)
          4583521     batch    matlab    jcorn2  R       17:18      1 n67
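
If you want only the state and the node names, for example to loop over the nodes in a script, a sketch along the following lines may help. The helper names job_status and job_nodes are hypothetical and not part of Slurm. The sketch assumes that squeue and scontrol are on your path, as they are on a login node.

```shell
#!/bin/bash
# Hypothetical helpers (these names are not part of Slurm).

# Print the compact job state and the node list, without the header row.
job_status() {
    local jobid="$1"
    # -h suppresses the header; %t is the compact state (e.g. R), %N the node list.
    squeue -h -j "$jobid" -o "%t %N"
}

# Print one node name per line, expanding ranges such as n[67-68].
job_nodes() {
    local jobid="$1"
    scontrol show hostnames "$(squeue -h -j "$jobid" -o "%N")"
}

# Usage on the cluster (not run here):
#   job_status 4583521
#   job_nodes 4583521
```

For a single-node job such as the one above, job_nodes should print just that one node name.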

 

You can check on the status of the process by using ssh to log in to each of the nodes your job is running on. Once you have successfully logged in to a node, the following commands can be used to find your active process and attach to it. The example uses strace, which traces the system calls the process makes.

How to Attach a Debugger
n67$ ps aux | grep jcorn2
jcorn2  17668  0.0  0.0 106108  1224 ?        S    11:09   0:00 /bin/bash /cm/local/apps/slurm/var/spool/job4583521/slurm_script
jcorn2  17672  0.0  0.0 502220  4836 ?        Sl   11:09   0:00 srun matlab
jcorn2  17673  0.0  0.0  26996   704 ?        S    11:09   0:00 srun matlab
jcorn2  17704  0.0  0.0 112476 11076 ?        SL   11:09   0:00 /home/jcorn2/bin/
jcorn2  17705  0.0  0.0 112476 11064 ?        SL   11:09   0:00 /home/jcorn2/bin/
jcorn2  17706  0.0  0.0 112476 11064 ?        SL   11:09   0:00 /home/jcorn2/bin/
jcorn2  17707  0.0  0.0 112476 11064 ?        SL   11:09   0:00 /home/jcorn2/bin/
root    26975  0.0  0.0 103308   868 pts/0        S+   11:54   0:00 grep jcorn2
n67$ strace -p 17705
Process 17705 attached
accept(3, ^CProcess 17705 detached
<detached ...>
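
In the trace above, the process is blocked in accept() on file descriptor 3, which means it is waiting for an incoming connection. Pressing Ctrl-C detaches strace and leaves the process running. As a variation on the ps and grep step, the sketch below lists one user's processes directly, so the grep command itself does not show up in the output. The helper name list_my_procs is hypothetical, and the sketch assumes the procps version of ps that is usual on Linux.

```shell
#!/bin/bash
# Hypothetical helper (the name is illustrative only).
# List PID, state, elapsed time and command line for one user's processes.
list_my_procs() {
    local user="${1:-$(id -un)}"
    # -u selects processes by user, so no grep is needed to filter the list.
    ps -u "$user" -o pid,stat,etime,args
}

# Usage on the compute node (not run here):
#   list_my_procs jcorn2
#   strace -f -p 17705    # -f also traces threads and child processes
#   gdb -p 17705          # assumes gdb is installed on the node
```

Unless you are root, you can normally attach only to processes that you own.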