
BAli-Phy is a free software program for simultaneously estimating a multiple sequence alignment and its phylogenetic tree. BAli-Phy achieves high accuracy in alignment estimation by using information from the co-estimated phylogeny, and it takes alignment uncertainty into account while estimating the phylogeny by averaging over possible alignments. Unlike most phylogeny inference software, it does not require input sequences to be aligned beforehand. This differs from traditional approaches to alignment and phylogeny estimation, which first estimate the alignment without a high-quality tree estimate and then estimate the tree given that fixed alignment. BAli-Phy produces a Bayesian posterior distribution on both the alignment and the tree, so its output conveys the uncertainty in each. BAli-Phy uses Markov chain Monte Carlo (MCMC) methods for estimation, and a run can take several days.
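
The joint estimation described above can be written out as a short sketch. The notation is chosen here for illustration and is not taken from the BAli-Phy documentation: Y denotes the unaligned input sequences, A the alignment, \tau the tree, and \theta the remaining evolutionary parameters. The first line is Bayes' rule for the joint posterior; the second shows how the tree posterior averages over alignments instead of conditioning on a single one.

```latex
\begin{align}
P(A, \tau, \theta \mid Y) &= \frac{P(Y, A \mid \tau, \theta)\, P(\tau, \theta)}{P(Y)} \\
P(\tau \mid Y) &= \sum_{A} \int P(A, \tau, \theta \mid Y)\, d\theta
\end{align}
```

In practice the sum and integral are not computed directly; MCMC draws samples of (A, \tau, \theta) whose frequencies approximate these posterior probabilities.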


Alignment uncertainty

Alignment uncertainty stems from two main sources: near-optimal alignments and evolutionary parameter uncertainty. Evolutionary parameters include branch lengths, substitution rates, insertion/deletion rates, and the phylogeny itself. If the exact values of these parameters are unknown and the alignment estimate is sensitive to them, then the alignment cannot be known with confidence. Even when the evolutionary parameters are fully known, many different alignments may be optimal or nearly optimal. In this case, the researcher cannot have confidence in any single alignment and must instead average over the cloud of near-optimal alignments. BAli-Phy handles both sources of uncertainty by integrating over possible alignments and parameter values.
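
To make "averaging over the cloud of alignments" concrete, the sketch below estimates how often each pair of residues is placed in the same column across a set of sampled alignments. It is an illustration only, not BAli-Phy's own code: the function names `aligned_pairs` and `pair_support`, and the representation of an alignment as a dictionary of gapped strings, are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(alignment):
    """Return the residue pairs that one alignment places in the same column.

    `alignment` maps sequence name -> gapped string ('-' marks a gap).
    A residue is identified as (name, index), where index counts the
    non-gap characters of that sequence from zero.
    """
    lengths = {len(row) for row in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    names = sorted(alignment)
    next_index = {name: 0 for name in names}
    pairs = set()
    for col in range(lengths.pop()):
        present = []
        for name in names:
            if alignment[name][col] != "-":
                present.append((name, next_index[name]))
                next_index[name] += 1
        # Every two residues sharing a column are treated as homologous.
        pairs.update(combinations(present, 2))
    return pairs


def pair_support(samples):
    """Fraction of sampled alignments in which each residue pair is aligned."""
    counts = Counter()
    for alignment in samples:
        counts.update(aligned_pairs(alignment))
    return {pair: n / len(samples) for pair, n in counts.items()}


# Two hypothetical samples that disagree about the middle residue.
samples = [
    {"s1": "AC-G", "s2": "A-TG"},
    {"s1": "ACG", "s2": "ATG"},
]
support = pair_support(samples)
```

Pairs that appear in every sample receive support 1.0, while pairs on which the samples disagree receive a lower value, which is one simple way to summarize alignment uncertainty column by column.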


Input and output

BAli-Phy accepts nucleotide, amino acid, and codon sequences in FASTA format. Input sequences need not be aligned. Ambiguous nucleotides such as R and Y are supported, as are the ambiguous amino acids B, Z, and J. Trees are output in Newick format, and alignments are output in FASTA format. Output alignments include homology information for sequences at internal nodes of the tree.
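
As an illustration of the input described above, the following minimal sketch reads FASTA text into a dictionary. It is not BAli-Phy's parser; the function name `read_fasta` and the sample records are hypothetical. The example input is deliberately unaligned (the sequences differ in length and contain no gap characters) and includes the ambiguous nucleotide codes R and Y.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence name -> sequence string.

    The name is the first whitespace-delimited word after '>'.
    Sequence data may span several lines; blank lines are ignored.
    """
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header line has no sequence name")
            name = header[0]
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Hypothetical unaligned input: lengths differ and no gaps are present.
example = """>seqA first example
ACGTRY
ACG
>seqB
ACGY
"""
sequences = read_fasta(example)
```

Because the sequences are unaligned, their lengths are free to differ; a reader for aligned input would instead require every row to have the same length.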


See also

* Sequence alignment software


External links

* Official website: http://www.bali-phy.org/