How to run MrBayes on maya

Introduction

MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. To run the MrBayes software interactively on the front-end node of maya, use the command

[araim1@maya-usr1 ~]$ mb

We will now demonstrate running MrBayes on the compute nodes. This procedure was adapted from MbWiki. First, make sure to update your switcher settings as follows:

[araim1@maya-usr1 ~]$ switcher mpi = gcc-mvapich2-1.4rc2
[araim1@maya-usr1 ~]$ switcher_reload

This is necessary to match the compiler and MPI implementation originally used to configure MrBayes. Next, create a .nex file as follows, replacing the placeholders with your own contents.

#NEXUS

BEGIN DATA;
    ... Insert data here ...
END;

begin mrbayes;
    ... Insert MrBayes block contents here ...
END;


Download: ../code/mrbayes-example/my.nex

Now create a small script, batch.txt, with commands to execute the .nex file:

set autoclose=yes nowarn=yes
execute my.nex
quit


Download: ../code/mrbayes-example/batch.txt

Note that there are other ways to set up a call to MrBayes; for example, MCMC and likelihood options can be specified in a file separate from the data. The last step is to create a standard SLURM batch script. This script will run the MrBayes program on the compute nodes; the program will be run in parallel if multiple processes are requested. MrBayes will then run batch.txt, which will in turn execute our .nex file.

#!/bin/bash
#SBATCH --job-name=mrbayes
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8

srun mb batch.txt 


Download: ../code/mrbayes-example/run.slurm

Make sure you have read the “how to run” tutorial before attempting to use the batch system.
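
As a minimal sketch of the submission step, the helper below checks that the three files created in this section are present before the job is handed to the scheduler. The function name preflight is hypothetical and not part of MrBayes or SLURM; sbatch and squeue are the standard SLURM commands for submitting a job and listing queued jobs.

```shell
#!/bin/bash
# preflight (hypothetical helper): confirm that the files created in this
# section exist in the current directory. Returns 0 if all are present.
preflight() {
    local missing=0 f
    for f in my.nex batch.txt run.slurm; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the front-end node, from the directory holding the files:
#   preflight && sbatch run.slurm    # submit only if nothing is missing
#   squeue -u "$USER"                # list your jobs in the queue
```

With the run.slurm shown above, standard output and standard error from the job would be written to slurm.out and slurm.err in the submission directory.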

Setting checkpoints for long runs

For users needing very long runs of MrBayes, we suggest breaking up the work into several small jobs rather than one very long job. Long jobs have a higher probability of being interrupted by maintenance windows or unforeseen problems. Fortunately, MrBayes has a built-in mechanism for creating checkpoints, where progress can be saved from one job and continued in a subsequent job.

To demonstrate this, consider the “primates.nex” example that comes with the MrBayes software. We will create two MrBayes run scripts to analyze this data. The first script represents the initial run.

execute primates.nex;

mcmc ngen=10000000 nruns=2 temp=0.02 mcmcdiag=yes samplefreq=1000 
stoprule=yes stopval=0.005 relburnin=yes burninfrac=0.1 printfreq=1000 
checkfreq=1000;

Download: ../code/mrbayes-checkpoint/cmds1.nex

Notice that we set “checkfreq”, which is the number of generations between checkpoints. The second script continues where the first script left off.

execute primates.nex;

mcmc ngen=20000000 nruns=2 temp=0.02 mcmcdiag=yes samplefreq=1000
stoprule=yes stopval=0.005 relburnin=yes burninfrac=0.1 printfreq=1000
append=yes checkfreq=1000;

Download: ../code/mrbayes-checkpoint/cmds2.nex

The only differences are the addition of the option “append=yes”, which tells MrBayes to continue from the checkpoint, and that “ngen” has been increased to request additional generations. In this case, we want it to resume from our 10-million-generation run and stop after 20 million generations. Note that all values, other than “ngen” and “append”, must match between the two scripts; the run may fail otherwise. Running these two scripts directly from the command line yields the following:

[araim1@maya-usr1 mrbayes-checkpoint]$ ls
cmds1.nex  cmds2.nex  primates.nex
[araim1@maya-usr1 mrbayes-checkpoint]$ mb cmds1.nex 
                            MrBayes v3.2.1 x64
...
   Executing file "cmds1.nex"
...
   Executing file "primates.nex"...
...
Returning execution to calling file ...
...
Chain results (10000 generations requested):
...
[araim1@maya-usr1 mrbayes-checkpoint]$ ls
cmds1.nex          primates.nex         primates.nex.mcmc    primates.nex.run2.p
cmds2.nex          primates.nex.ckp     primates.nex.run1.p  primates.nex.run2.t
primates.nex.ckp~  primates.nex.run1.t  run.slurm
[araim1@maya-usr1 mrbayes-checkpoint]$ mb cmds2.nex
                            MrBayes v3.2.1 x64
...
   Executing file "cmds2.nex"
...
   Executing file "primates.nex"...
...
   Returning execution to calling file ...
...
      Executing file "primates.nex.ckp"...
[araim1@maya-usr1 mrbayes-checkpoint]$
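
Because every option other than “ngen” and “append” has to match between the two scripts, one way to avoid copy errors is to derive the second script from the first. The following is only a sketch: the function name make_continuation is hypothetical, and it assumes the first script contains a single “ngen=” option and no “append=” option. It places “append=yes” directly after “ngen” rather than before “checkfreq”, on the assumption that the order of options within the mcmc command does not matter.

```shell
#!/bin/bash
# make_continuation (hypothetical helper): copy an initial MrBayes run script,
# raising ngen to a new total and adding append=yes right after it.
# All other options are carried over unchanged.
#   $1 = initial script, $2 = continuation script to write, $3 = new ngen
make_continuation() {
    sed "s/ngen=[0-9][0-9]*/ngen=$3 append=yes/" "$1" > "$2"
}

# Usage, with the file names from this section:
#   make_continuation cmds1.nex cmds2.nex 20000000
```
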

This mechanism can be used to plan for very long runs, by setting “ngen” accordingly, and also to recover from unexpected failures. When carrying out real runs on maya, the user should place these calls into batch scripts as illustrated in the previous section.
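
One possible way to package the two scripts into a single batch script is sketched below. It starts from cmds1.nex when no checkpoint file exists and from cmds2.nex otherwise, so the same script could be resubmitted after an interruption. The function name pick_cmds and the job name are hypothetical; the partition and task counts are copied from the earlier run.slurm, and the checkpoint file name primates.nex.ckp is the one shown in the directory listing above. The launch is guarded by SLURM_JOB_ID, which SLURM sets inside a job, so that nothing is started if the file is sourced on the front-end node.

```shell
#!/bin/bash
#SBATCH --job-name=mrbayes-ckp
#SBATCH --output=slurm-%j.out
#SBATCH --error=slurm-%j.err
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8

# pick_cmds (hypothetical helper): print the run script to use, based on
# whether a checkpoint from an earlier job exists in the current directory.
pick_cmds() {
    if [ -f primates.nex.ckp ]; then
        echo cmds2.nex    # checkpoint found: continue with append=yes
    else
        echo cmds1.nex    # no checkpoint: start the initial run
    fi
}

# Launch MrBayes only when running inside a SLURM job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun mb "$(pick_cmds)"
fi
```

The %j in the output file names is replaced by the job ID, so the logs of successive jobs are kept separate rather than overwritten.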