MrBayes

Description

MrBayes is a program for the Bayesian estimation of phylogeny. Bayesian inference of phylogeny is based on the posterior probability distribution of trees, that is, the probability of a tree conditioned on the observations. The conditioning is done using Bayes's theorem. This distribution cannot be calculated analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (MCMC) to approximate the posterior probabilities of trees. The program runs from the command line.

There are four steps to a typical Bayesian phylogenetic analysis using MrBayes:

  • Read the Nexus data file
  • Set the evolutionary model
  • Run the analysis
  • Summarize the samples
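
The four steps map onto MrBayes commands that can be collected in a small batch file. The following is a minimal sketch rather than a recommended analysis: the file names myInput.nex and batch.nex are placeholders, and the model and chain settings are chosen only for illustration.

```shell
#!/bin/sh
# Write a MrBayes batch file covering the four steps.
# Assumptions: myInput.nex is your Nexus data file; batch.nex is a
# hypothetical name for the generated command file.
cat > batch.nex <<'EOF'
#NEXUS
begin mrbayes;
	set autoclose=yes nowarn=yes;      [ do not prompt during a batch run ]
	execute myInput.nex;               [ 1. read the Nexus data file ]
	lset nst=6 rates=gamma;            [ 2. set the evolutionary model ]
	mcmc ngen=50000 samplefreq=100;    [ 3. run the analysis ]
	sump;                              [ 4. summarize the parameter samples ]
	sumt;                              [ 4. summarize the tree samples ]
end;
EOF

# Show the generated file; run it later with: mb batch.nex
cat batch.nex
```

Keeping the commands in a separate file leaves the data file untouched, so the same alignment can be reused with different settings.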

License

GNU GPL 2

Usage

Upcoming module system change

Because of the large number of applications and versions, it is not practical to list them all explicitly on our wiki pages, so the modulefiles are being upgraded. After the upgrade, every application will have a default module. The default can be loaded without a version number and provides one version of the application (usually the latest).

You can test the new version now by adding the line

source /cvmfs/software.metacentrum.cz/modulefiles/5.1.0/loadmodules

to your script before loading a module. You can then list all versions of mrbayes and load the default version with

module avail mrbayes/ # list available modules
module load mrbayes   # load (default) module


If you prefer to stay with the current system, that is still possible. List all modules with

module avail mrbayes

and choose the explicit version you want to use. Then run "mb" for the single-processor version or "mpiexec mb-mpi" for the MPI version on several processors ("mpirun mb" with the newest module).

Modules available:

  • mrbayes
  • mrbayes-3.2
  • mrbayes-3.2.1
  • mrbayes-3.2.2
  • mrbayes-3.2.{3,4,6} (compiled with Intel compilers)
  • mrbayes-3.2.7a (compiled with Intel compilers)
  • mrbayes/mrbayes-3.2.7a-intel-19.0.4-s2htn4w (compiled with Intel compilers and BEAGLE library)

For example:

qsub -I -l select=1:ncpus=2:mem=10gb
module add mrbayes/3.2.7a-intel-19.0.4-s2htn4w
mpirun mb myInput.nex
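
The same run can also be submitted as a non-interactive batch job. The sketch below only writes a job script; the script name mrbayes_job.sh, the job name, and the walltime value are assumptions made for illustration, and myInput.nex is expected in the directory you submit from.

```shell
#!/bin/sh
# Generate a PBS job script that mirrors the interactive example.
# Assumptions: mrbayes_job.sh, the job name and the walltime are
# placeholders; adjust resources to your data set.
cat > mrbayes_job.sh <<'EOF'
#!/bin/bash
#PBS -N mrbayes
#PBS -l select=1:ncpus=2:mem=10gb
#PBS -l walltime=24:00:00

# Start in the directory from which the job was submitted
cd "$PBS_O_WORKDIR" || exit 1

module add mrbayes/3.2.7a-intel-19.0.4-s2htn4w
mpirun mb myInput.nex
EOF

echo "Wrote mrbayes_job.sh; submit it with: qsub mrbayes_job.sh"
```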

Documentation

https://nbisweden.github.io/MrBayes/manual.html

Homepage

https://nbisweden.github.io/MrBayes/

Problems

Parallel versions 3.2 and 3.2.1 do not produce correct trees; see http://sourceforge.net/projects/mrbayes/develop

If you run version 3.1 and see error messages like the following at the end of your output file

p0_7950:  p4_error: interrupt SIGSEGV: 11
p3_7957:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_3_7977: (41.703125) net_send: could not write to fd=5, errno = 32
p4_7967:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_4_7978: (41.699219) net_send: could not write to fd=5, errno = 32
p5_7969:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_5_7979: (41.699219) net_send: could not write to fd=5, errno = 32
p6_7971:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_6_7980: (41.699219) net_send: could not write to fd=5, errno = 32
rm_l_2_7976:  p4_error: interrupt SIGx: 15
rm_l_2_7976: (41.722656) net_send: could not write to fd=7, errno = 32
p0_7950: (57.542969) net_send: could not write to fd=4, errno = 32
p4_7967: (55.699219) net_send: could not write to fd=5, errno = 32
p5_7969: (55.699219) net_send: could not write to fd=5, errno = 32
p3_7957: (55.707031) net_send: could not write to fd=5, errno = 32
p6_7971: (55.707031) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0,3-6 exited with status 1.

use version 3.2.
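
A quick way to check an output file for this failure is to count the p4_error lines. This is a minimal sketch: mb_output.log is a hypothetical file name, and three lines of the message above stand in for a real log.

```shell
#!/bin/sh
# Count p4_error lines in a MrBayes output file.
# Assumption: mb_output.log is a hypothetical name for the file that
# holds the job output; the sample lines are copied from the message above.
log=mb_output.log
cat > "$log" <<'EOF'
p0_7950:  p4_error: interrupt SIGSEGV: 11
p3_7957:  p4_error: net_recv read:  probable EOF on socket: 1
mpiexec: Warning: tasks 0,3-6 exited with status 1.
EOF

# grep -c prints the number of matching lines
errors=$(grep -c 'p4_error' "$log")
if [ "$errors" -gt 0 ]; then
    echo "$errors p4_error line(s) found in $log"
fi
```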

Example

A beginner's guide is available at http://cluster.prf.jcu.cz/index.php/guides/general/bayesfromscratch

A Nexus input file for the analysis typically begins like this:

#NEXUS
[saved by seaview on Wed Oct 26 08:34:48 2011]
BEGIN DATA;
  DIMENSIONS NTAX=52 NCHAR=1153;
  FORMAT DATATYPE=DNA
  GAP=-
  ;
MATRIX
[1] TCCMP1185
cgaaagcctgacggagca...

and ends like this:

...cctcctt
;
begin mrbayes;



	[ Set the parameters of the likelihood model, keeping prset at default conditions ]
	lset 
		nst=6
		nucmodel=4by4
		code=universal
		rates=gamma
		ngammacat=4;

	[ Set the outgroup for the analysis ]
	outgroup 52;

	[ Set Markov chain Monte Carlo parameters ]
	mcmcp
		ngen=50000
		nruns=2
		swapfreq=5
		printfreq=100
		samplefreq=100
		nchains=4
		savebrlens=yes
		ordertaxa=no
		filename=vysl-BT8;

	[ Go! ]
	mcmc;

	[ Summarize the sampled parameters and trees ]
	sump burnin=250;
	sumt burnin=250;

end;

The lines at the end (sump and sumt) are needed mainly for version 3.2, because they summarize the results.
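
The burnin value counts samples, not generations, so it should be read against the number of samples each run produces. The arithmetic below is a small sketch using the values from the example; the "+ 1" assumes that the starting state of the chain is also written to the sample files.

```shell
#!/bin/sh
# Values taken from the mcmcp and sump/sumt lines of the example.
ngen=50000
samplefreq=100
burnin=250

# Assumption: the starting state is sampled too, hence the "+ 1".
samples=$(( ngen / samplefreq + 1 ))
kept=$(( samples - burnin ))

echo "samples per run: $samples"
echo "discarded as burn-in: $burnin"
echo "kept for the summary: $kept"
```

With these settings roughly half of the samples from each run are discarded before summarizing. If you change ngen or samplefreq, adjust burnin to match.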

Use with GNU Parallel

If you need to run several MrBayes analyses in parallel (for example, one per gene), one option is GNU Parallel, a tool that launches multiple processes in parallel on several CPU cores or computers. Basic use can look like this:

module add parallel
module add mrbayes-3.2.2

ls *.nexus | parallel -j 10 'echo Start > {}.log && date >> {}.log && mb {} | tee -a {}.log && echo End: >> {}.log && date >> {}.log'

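The parameter "-j 10" tells GNU Parallel to run up to 10 jobs at once, which here means using up to 10 CPU cores. See the GNU Parallel manual for details. The remaining commands write a log with time stamps for each input file.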

The simplest form:

ls *.nexus | parallel -j 10 'mb {}'