MrBayes

Description

MrBayes is a program for Bayesian estimation of phylogeny. Bayesian inference of phylogeny is based on the posterior probability distribution of trees, that is, the probability of a tree conditioned on the observations. The conditioning is accomplished using Bayes's theorem. The posterior probability distribution of trees cannot be calculated analytically; instead, MrBayes uses a simulation technique called Markov chain Monte Carlo (MCMC) to approximate the posterior probabilities of trees. The program runs from the command line.

There are four steps to a typical Bayesian phylogenetic analysis using MrBayes:

  • Read the Nexus data file
  • Set the evolutionary model
  • Run the analysis
  • Summarize the samples
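
The four steps above can be sketched as a MrBayes batch file. This is a minimal illustration, not a recommended analysis: the file names "myInput.nex" and "mb_batch.nex" are placeholders, and the model and chain settings are simply copied from the example further down this page.

```shell
# Write a MrBayes batch file that walks through the four steps.
# "myInput.nex" and "mb_batch.nex" are placeholder file names.
cat > mb_batch.nex <<'EOF'
#NEXUS
begin mrbayes;
    set autoclose=yes nowarn=yes;      [ do not stop for interactive prompts ]
    execute myInput.nex;               [ 1. read the Nexus data file ]
    lset nst=6 rates=gamma;            [ 2. set the evolutionary model ]
    mcmc ngen=50000 samplefreq=100;    [ 3. run the analysis ]
    sump;                              [ 4. summarize the parameter samples ]
    sumt;                              [    and the tree samples ]
end;
EOF
echo "Batch file written; run it with: mb mb_batch.nex"
```

Keeping the commands in a separate batch file leaves the data file untouched, so the same alignment can be reused with different model settings.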

Availability

Versions 3.1, beta 3.2, 3.2, 3.2.1, 3.2.2, 3.2.3, 3.2.4 and 3.2.6. Both a single-processor version and an MPI version are provided; 3.1 is compiled with mpich-p4 and 3.2 with openmpi. The MPI version has to be run through mpiexec with the appropriate MPI module loaded. The Dirichlet priors version 3.1.2 and the GPU version are compiled with openmpi.

Use

Load the module ("module add mrbayes") and run "mb" for the single-processor version or "mpiexec mb-mpi" for the MPI version on multiple processors.

Modules available:

  • mrbayes
  • mrbayes-pre3.2 (beta version)
  • mrbayes-3.2
  • mrbayes-3.2.1
  • mrbayes-3.2.2
  • mrbayes-3.2.3 (compiled with intel compilers)
  • mrbayes-3.2.4 (compiled with intel compilers)
  • mrbayes-3.2.6 (compiled with intel compilers)
  • mrbayes-3.1.2dir
  • mrbayes-gpu-2.1.1 (guide for GPU_clusters)

E.g.

qsub -I -l nodes=1:ppn=2 -l mem=1gb
module add mrbayes-3.2.4
mpirun mb-mpi myInput.nex
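
The interactive session above could also be submitted as a batch job. The sketch below only writes a job script; it reuses the resource requests and commands from the example, while the script name "mrbayes_job.sh" and the use of the PBS variable PBS_O_WORKDIR to return to the submission directory are assumptions about the local setup.

```shell
# Write a batch job script equivalent to the interactive example above.
# "mrbayes_job.sh" and "myInput.nex" are placeholder names.
cat > mrbayes_job.sh <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=2
#PBS -l mem=1gb

# Start in the directory the job was submitted from
cd "$PBS_O_WORKDIR" || exit 1

module add mrbayes-3.2.4
mpirun mb-mpi myInput.nex
EOF
echo "Job script written; submit it with: qsub mrbayes_job.sh"
```

In a batch job nobody can answer interactive prompts, so the input file should probably contain "set autoclose=yes" in its mrbayes block.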

Documentation

http://mrbayes.csit.fsu.edu/wiki/index.php/Manual

License

GNU GPL 2

Supported platforms

amd64_linux26

Program administrator

meta@cesnet.cz

Homepage

http://mrbayes.csit.fsu.edu/index.php

Problems

Parallel versions 3.2 and 3.2.1 do not produce correct trees, see http://sourceforge.net/projects/mrbayes/develop

If you are running version 3.1 and see error messages like these at the end of your output file

p0_7950:  p4_error: interrupt SIGSEGV: 11
p3_7957:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_3_7977: (41.703125) net_send: could not write to fd=5, errno = 32
p4_7967:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_4_7978: (41.699219) net_send: could not write to fd=5, errno = 32
p5_7969:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_5_7979: (41.699219) net_send: could not write to fd=5, errno = 32
p6_7971:  p4_error: net_recv read:  probable EOF on socket: 1
rm_l_6_7980: (41.699219) net_send: could not write to fd=5, errno = 32
rm_l_2_7976:  p4_error: interrupt SIGx: 15
rm_l_2_7976: (41.722656) net_send: could not write to fd=7, errno = 32
p0_7950: (57.542969) net_send: could not write to fd=4, errno = 32
p4_7967: (55.699219) net_send: could not write to fd=5, errno = 32
p5_7969: (55.699219) net_send: could not write to fd=5, errno = 32
p3_7957: (55.707031) net_send: could not write to fd=5, errno = 32
p6_7971: (55.707031) net_send: could not write to fd=5, errno = 32
mpiexec: Warning: tasks 0,3-6 exited with status 1.

use version 3.2.
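
A quick way to check an output file for this kind of failure is to search it for "p4_error". The helper below is a minimal sketch; the function name has_p4_error and the file name "myJob.out" are hypothetical.

```shell
# Return success (exit status 0) if the given output file contains
# any "p4_error" line, as in the log excerpt above.
has_p4_error() {
    grep -q 'p4_error' "$1"
}

# Hypothetical usage: "myJob.out" is a placeholder for your job's output file.
out_file="myJob.out"
if [ -f "$out_file" ] && has_p4_error "$out_file"; then
    echo "p4_error found in $out_file: consider switching to version 3.2"
fi
```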

Example

A beginner's guide is available at http://cluster.prf.jcu.cz/index.php/guides/general/bayesfromscratch

A typical beginning of a Nexus file for analysis is

#NEXUS
[saved by seaview on Wed Oct 26 08:34:48 2011]
BEGIN DATA;
  DIMENSIONS NTAX=52 NCHAR=1153;
  FORMAT DATATYPE=DNA
  GAP=-
  ;
MATRIX
[1] TCCMP1185
cgaaagcctgacggagca...

and the end of the file is

...cctcctt
;
begin mrbayes;



	[ Set the parameters of the likelihood model, keeping prset at default conditions ]
	lset 
		nst=6
		nucmodel=4by4
		code=universal
		rates=gamma
		ngammacat=4;

	[ Set the outgroup for the analysis ]
	outgroup 52;

	[ Set Markov chain Monte Carlo parameters ]
	mcmcp
		ngen=50000
		nruns=2
		swapfreq=5
		printfreq=100
		samplefreq=100
		nchains=4
		savebrlens=yes
		ordertaxa=no
		filename=vysl-BT8;

	[ Go! ]
	mcmc;
	sump burnin=250;
	sumt burnin=250;

	
end;

The lines at the end (sump and sumt) are mostly necessary for version 3.2, because they summarize the results.
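
The burnin value in the example can be related to the chain settings with a little arithmetic. The sketch below assumes that burnin counts sampled states rather than generations (check the manual for your version), and it reads burnin=250 as "discard the first half of the samples"; that reading is an interpretation of the example, not a general recommendation.

```shell
ngen=50000        # generations, from the mcmcp block above
samplefreq=100    # one sample every 100 generations

# Samples per run, not counting the starting state
samples=$(( ngen / samplefreq ))

# Discard the first half of the samples as burn-in
burnin=$(( samples / 2 ))

echo "sump burnin=${burnin};"
echo "sumt burnin=${burnin};"
```

If ngen or samplefreq is changed, the burnin value in the sump and sumt lines should be adjusted in the same way.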

Use with GNU Parallel

If you need to run several MrBayes analyses in parallel (for example, a separate one for each gene), one option is GNU Parallel, a tool that lets the user launch multiple processes in parallel on several CPU cores or computers. Basic use can look like this:

module add parallel
module add mrbayes-3.2.2

ls *.nexus | parallel -j 10 'echo Start > {}.log && date >> {}.log && mb {} | tee -a {}.log && echo End: >> {}.log && date >> {}.log'

The parameter "-j 10" tells GNU Parallel to run up to 10 jobs at once, that is, to use 10 CPU cores. See the GNU Parallel manual for details. The remaining commands produce logs with time stamps.

The simplest example:

ls *.nexus | parallel -j 10 'mb {}'
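
The logging one-liner shown earlier can also be written as a small shell function, which is easier to read and to extend. This is a sketch only: the function name run_with_log is hypothetical, and the loop below runs the files one after another rather than in parallel.

```shell
# Run a command on one input file and write a timestamped log next to it.
# Usage: run_with_log INPUT COMMAND [ARGS...]
run_with_log() {
    input=$1
    shift
    log="${input}.log"
    {
        echo "Start: $(date)"
        "$@" "$input"          # e.g. "mb gene1.nexus"
        status=$?
        echo "End: $(date)"
    } > "$log" 2>&1
    return "$status"
}

# Sequential use over all Nexus files in the current directory
for f in *.nexus; do
    [ -e "$f" ] || continue    # skip if no file matches the pattern
    run_with_log "$f" mb
done
```

In bash, the function can be handed to GNU Parallel by exporting it first ("export -f run_with_log") and then calling "ls *.nexus | parallel -j 10 run_with_log {} mb".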