MrBayes
Description
MrBayes is a program for the Bayesian estimation of phylogeny. Bayesian inference of phylogeny is based upon a quantity called the posterior probability distribution of trees, which is the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes's theorem. The posterior probability distribution of trees is impossible to calculate analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (MCMC) to approximate the posterior probabilities of trees. The program runs from the command line.
There are four steps to a typical Bayesian phylogenetic analysis using MrBayes:
- Read the Nexus data file
- Set the evolutionary model
- Run the analysis
- Summarize the samples
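The four steps map onto MrBayes commands roughly as shown in the following sketch. It is only an illustration, not a recommended setup: the file names myInput.nex and batch.nex, the helper name write_mb_batch, and the model and chain settings are placeholders chosen for this example.

```shell
#!/bin/sh
# Sketch: write a MrBayes batch file covering the four steps, then run it.
# myInput.nex, batch.nex and all parameter values are illustrative assumptions.

write_mb_batch() {
    # $1 = Nexus data file, $2 = batch file to create
    cat > "$2" <<EOF
#NEXUS
begin mrbayes;
    set autoclose=yes nowarn=yes;   [ do not prompt, so the run works non-interactively ]
    execute $1;                     [ 1. read the Nexus data file ]
    lset nst=6 rates=gamma;         [ 2. set the evolutionary model ]
    mcmc ngen=50000 samplefreq=100; [ 3. run the analysis ]
    sump;                           [ 4. summarize the parameter samples ]
    sumt;                           [    and the tree samples ]
end;
EOF
}

write_mb_batch myInput.nex batch.nex

# Run only if mb is on the PATH (i.e. after loading the module) and the data file exists.
if command -v mb >/dev/null 2>&1 && [ -f myInput.nex ]; then
    mb batch.nex
fi
```

Keeping the commands in a separate batch file leaves the data file untouched, so the same alignment can be reused with different model settings.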
License
GNU GPL 2
Usage
Upcoming module system change alert!
Because of the large number of applications and their versions, it is not practical to list them all explicitly on our wiki pages. An upgrade of the modulefiles is therefore underway. One feature of this upgrade is a default module for every application. The default module is loaded without a version number and provides some (usually the latest) version.
You can test the new version now by adding the line
source /cvmfs/software.metacentrum.cz/modulefiles/5.1.0/loadmodules
to your script before loading a module. You can then list all versions of mrbayes and load the default version with
module avail mrbayes/ # list available modules
module load mrbayes # load (default) module
If you prefer to keep using the current system, that is still possible. Simply list all modules with
module avail mrbayes
and choose the explicit version you want to use. Then run "mb" for the single-processor version, or "mpiexec mb-mpi" for the MPI version on multiple processors ("mpirun mb" with the newest module).
Modules available:
mrbayes
mrbayes-3.2
mrbayes-3.2.1
mrbayes-3.2.2
mrbayes-3.2.{3,4,6} (compiled with Intel compilers)
mrbayes-3.2.7a (compiled with Intel compilers)
mrbayes/mrbayes-3.2.7a-intel-19.0.4-s2htn4w (compiled with Intel compilers and BEAGLE library)
E.g.
qsub -I -l select=1:ppn=2:mem=10gb
module add mrbayes/3.2.7a-intel-19.0.4-s2htn4w
mpirun mb myInput.nex
Documentation
https://nbisweden.github.io/MrBayes/manual.html
Homepage
https://nbisweden.github.io/MrBayes/
Problems
Parallel versions 3.2 and 3.2.1 do not produce correct trees, see http://sourceforge.net/projects/mrbayes/develop
If you are running version 3.1 and see error messages like the following at the end of your output file
p0_7950: p4_error: interrupt SIGSEGV: 11
p3_7957: p4_error: net_recv read: probable EOF on socket: 1
rm_l_3_7977: (41.703125) net_send: could not write to fd=5, errno = 32
p4_7967: p4_error: net_recv read: probable EOF on socket: 1
rm_l_4_7978: (41.699219) net_send: could not write to fd=5, errno = 32
p5_7969: p4_error: net_recv read: probable EOF on socket: 1
rm_l_5_7979: (41.699219) net_send: could not write to fd=5, errno = 32
p6_7971: p4_error: net_recv read: probable EOF on socket: 1
rm_l_6_7980: (41.699219) net_send: could not write to fd=5, errno = 32
rm_l_2_7976: p4_error: interrupt SIGx: 15
rm_l_2_7976: (41.722656) net_send: could not write to fd=7, errno = 32
p0_7950: (57.542969) net_send: could not write to fd=4, errno = 32
p4_7967: (55.699219) net_send: could not write to fd=5, errno = 32
p5_7969: (55.699219) net_send: could not write to fd=5, errno = 32
p3_7957: (55.707031) net_send: could not write to fd=5, errno = 32
p6_7971: (55.707031) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0,3-6 exited with status 1.
switch to version 3.2.
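To check a finished job for this problem without reading the whole output, you can search for the p4_error marker. The following is a minimal sketch; the helper name has_p4_errors and the file name sample_output.txt are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: report whether a job output file contains the MPI errors shown above.
# The function name has_p4_errors and the file names are illustrative assumptions.

has_p4_errors() {
    # Succeeds (exit status 0) if the file contains at least one p4_error line.
    grep -q 'p4_error' "$1"
}

# Example with a small sample file holding one of the error lines.
printf '%s\n' 'p0_7950: p4_error: interrupt SIGSEGV: 11' > sample_output.txt

if has_p4_errors sample_output.txt; then
    echo "p4_error found in sample_output.txt"
fi
```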
Example
A beginner's guide is available at http://cluster.prf.jcu.cz/index.php/guides/general/bayesfromscratch
A typical beginning of a Nexus file for analysis is
#NEXUS
[saved by seaview on Wed Oct 26 08:34:48 2011]
BEGIN DATA;
DIMENSIONS NTAX=52 NCHAR=1153;
FORMAT DATATYPE=DNA
GAP=-
;
MATRIX
[1] TCCMP1185
cgaaagcctgacggagca...
and the end of the file is
...cctcctt
;
begin mrbayes;
[ Set the parameters of the likelihood model, keeping prset at default conditions ]
lset
nst=6
nucmodel=4by4
code=universal
rates=gamma
ngammacat=4;
[ Set the outgroup for the analysis ]
outgroup 52;
[ Set Markov chain Monte Carlo parameters ]
mcmcp
ngen=50000
nruns=2
swapfreq=5
printfreq=100
samplefreq=100
nchains=4
savebrlens=yes
ordertaxa=no
filename=vysl-BT8;
[ Go! ]
mcmc;
sump burnin=250;
sumt burnin=250;
end;
The lines at the end (sump and sumt) are mostly necessary for version 3.2, because they summarize the results.
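The burnin value in sump and sumt is related to the mcmcp settings. The short calculation below is a sketch that assumes burnin counts samples rather than generations, as the MrBayes help text describes, and it does not count the starting state of the chain.

```shell
#!/bin/sh
# Sketch: relate burnin to the number of samples in the example above.
# Assumption: burnin is given in samples, not generations.
ngen=50000
samplefreq=100
burnin=250

samples=$((ngen / samplefreq))       # samples per run, not counting the starting state
kept=$((samples - burnin))           # samples left for the summary
percent=$((100 * burnin / samples))  # share of samples discarded

echo "samples per run: $samples"
echo "discarded as burn-in: $burnin ($percent%)"
echo "kept for the summary: $kept"
```

With the values from the example, about half of the samples in each run are discarded. If you change ngen or samplefreq, adjust burnin accordingly.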
Use with GNU Parallel
If you need to run several MrBayes analyses in parallel (for example, one per gene), one option is GNU Parallel, a tool that lets you launch multiple processes at once across several CPU cores or computers. Basic use can look like this:
module add parallel
module add mrbayes-3.2.2
ls *.nexus | parallel -j 10 'echo Start > {}.log && date >> {}.log && mb {} | tee -a {}.log && echo End: >> {}.log && date >> {}.log'
The parameter "-j 10" tells GNU Parallel to run up to 10 jobs at once, one per CPU core. See the GNU Parallel manual for details. The remaining commands produce logs with time stamps.
The simplest example:
ls *.nexus | parallel -j 10 'mb {}'
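For longer per-file logic, the logging one-liner can be moved into a small wrapper script that GNU Parallel calls once per input file. This is a sketch under stated assumptions: the script name run_mb.sh and the MB_CMD variable are hypothetical names introduced here, not part of MrBayes or GNU Parallel. Unlike the one-liner with tee, this version writes the MrBayes output only to the log file.

```shell
#!/bin/sh
# Sketch: the logging one-liner as a reusable wrapper script.
# run_mb.sh and MB_CMD are hypothetical names used only for this illustration.

cat > run_mb.sh <<'EOF'
#!/bin/sh
# Usage: run_mb.sh FILE.nexus
# MB_CMD lets you substitute another command for testing; it defaults to mb.
input=$1
log=$input.log
{
    echo "Start: $(date)"
    ${MB_CMD:-mb} "$input"
    echo "End: $(date)"
} > "$log" 2>&1
EOF
chmod +x run_mb.sh

# Launch one wrapper per input file, 10 at a time, when both tools are available.
if command -v parallel >/dev/null 2>&1 && command -v mb >/dev/null 2>&1; then
    parallel -j 10 ./run_mb.sh ::: *.nexus
fi
```

Setting MB_CMD=echo before calling ./run_mb.sh on one file is a quick way to check the log format without starting an analysis.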