MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. To run the MrBayes software interactively on the front-end node of maya, use the command
[araim1@maya-usr1 ~]$ mb
We will now demonstrate running MrBayes on the compute nodes. This was adapted from MbWiki. First, make sure to update your switcher settings as follows:
[araim1@maya-usr1 ~]$ switcher mpi = gcc-mvapich2-1.4rc2
[araim1@maya-usr1 ~]$ switcher_reload
This is necessary to match the compiler and MPI implementation originally used to configure MrBayes. Next, create a .nex file (named my.nex in this example) as follows, replacing the placeholders with your own contents.
#NEXUS
BEGIN DATA;
... Insert data here ...
END;
begin mrbayes;
... Insert MrBayes block contents here
END;
Now create a small script, named batch.txt, with commands to execute the .nex file:
set autoclose=yes nowarn=yes
execute my.nex
quit
Note that there are other ways to set up a call to MrBayes; for example, MCMC and likelihood options can be specified in a file separate from the data. The last step is to create a standard SLURM batch script. This script runs the MrBayes program on the compute nodes; the program runs in parallel if multiple processes are requested. MrBayes will then run batch.txt, which will in turn execute our .nex file.
#!/bin/bash
#SBATCH --job-name=mrbayes
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
srun mb batch.txt
Make sure you have read the how to run tutorial before attempting to use the batch system.
Setting checkpoints for long runs
For users needing very long runs of MrBayes, we suggest breaking up the work into several smaller jobs rather than one very long job. Long jobs have a higher probability of being interrupted by maintenance windows or unforeseen problems. Fortunately, MrBayes has a built-in mechanism for creating checkpoints, where progress can be saved from one job and continued in a subsequent job.
To demonstrate this, consider the “primates.nex” example that comes with the MrBayes software. We will create two MB run scripts to analyze this data. The first script represents the initial run.
execute primates.nex;
mcmc ngen=10000000 nruns=2 temp=0.02 mcmcdiag=yes samplefreq=1000 stoprule=yes stopval=0.005 relburnin=yes burninfrac=0.1 printfreq=1000 checkfreq=1000;
Notice that we set “checkfreq”, which is the number of generations between checkpoints. The second script continues where the first script left off.
execute primates.nex;
mcmc ngen=20000000 nruns=2 temp=0.02 mcmcdiag=yes samplefreq=1000 stoprule=yes stopval=0.005 relburnin=yes burninfrac=0.1 printfreq=1000 append=yes checkfreq=1000;
The only differences are the addition of the option “append=yes”, which tells MrBayes to continue from the checkpoint, and the larger value of “ngen”, which requests additional generations. In this case, we want MrBayes to read the state saved by our 10-million-generation run and stop after 20 million generations in total. Note that all values other than “ngen” and “append” must match between the two scripts; the run may fail otherwise. Running these two scripts directly from the command line yields the following
[araim1@maya-usr1 mrbayes-checkpoint]$ ls
cmds1.nex  cmds2.nex  primates.nex
[araim1@maya-usr1 mrbayes-checkpoint]$ mb cmds1.nex
MrBayes v3.2.1 x64
...
Executing file "cmds1.nex"
...
Executing file "primates.nex"...
...
Returning execution to calling file ...
...
Chain results (10000 generations requested):
...
[araim1@maya-usr1 mrbayes-checkpoint]$ ls
cmds1.nex  primates.nex  primates.nex.mcmc  primates.nex.run2.p
cmds2.nex  primates.nex.ckp  primates.nex.run1.p  primates.nex.run2.t
primates.nex.ckp~  primates.nex.run1.t  run.slurm
[araim1@maya-usr1 mrbayes-checkpoint]$ mb cmds2.nex
MrBayes v3.2.1 x64
...
Executing file "cmds2.nex"
...
Executing file "primates.nex"...
...
Returning execution to calling file ...
...
Executing file "primates.nex.ckp"...
[araim1@maya-usr1 mrbayes-checkpoint]$
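Since every option other than “ngen” and “append” must match between the two scripts, one way to avoid accidental mismatches is to generate both files from a single shared option string. The following is only a minimal sketch and is not part of MrBayes: the variable common and the helper write_cmds are hypothetical names, and the sketch assumes that the order of options on the mcmc line does not matter, since “append=yes” ends up last rather than before “checkfreq”.

```shell
#!/bin/bash
# Options shared by both runs; these must be identical in the two scripts.
common="nruns=2 temp=0.02 mcmcdiag=yes samplefreq=1000 stoprule=yes stopval=0.005 relburnin=yes burninfrac=0.1 printfreq=1000 checkfreq=1000"

# write_cmds FILE NGEN [EXTRA]
# Hypothetical helper: writes a MrBayes command script to FILE that requests
# NGEN generations, with EXTRA (if given) added to the shared options.
write_cmds() {
    local file="$1" ngen="$2" extra="${3:-}"
    {
        echo "execute primates.nex;"
        echo "mcmc ngen=${ngen} ${common}${extra:+ $extra};"
    } > "$file"
}

write_cmds cmds1.nex 10000000              # initial run
write_cmds cmds2.nex 20000000 append=yes   # continuation from the checkpoint
```

Changing a setting such as “samplefreq” then only requires editing the common string and regenerating both files.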
The checkpoint mechanism can be used to plan for very long runs, by setting “ngen” accordingly, and also to recover from unexpected failures. When carrying out real runs on maya, the user should place these calls into batch scripts as illustrated in the previous section.
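Within a batch script, the choice between the initial script and the continuation script could be made by testing whether a checkpoint file already exists. The following is a minimal sketch under the file names used in this example (primates.nex.ckp, cmds1.nex, cmds2.nex); choose_script is a hypothetical helper, and the sketch only prints the command it would run.

```shell
#!/bin/bash
# choose_script CKPFILE
# Hypothetical helper: prints the name of the MrBayes command script to run,
# depending on whether the checkpoint file CKPFILE is present.
choose_script() {
    local ckp="$1"
    if [ -f "$ckp" ]; then
        echo "cmds2.nex"   # checkpoint found: continue with append=yes
    else
        echo "cmds1.nex"   # no checkpoint: start the initial run
    fi
}

script=$(choose_script primates.nex.ckp)
# In a SLURM batch script, replace this echo with: srun mb "$script"
echo "Would run: mb $script"
```

With this approach the same batch script can be resubmitted after an interruption, provided “ngen” in cmds2.nex is large enough to cover the remaining generations.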