Inferring horizontal gene transfer

Horizontal or lateral gene transfer (HGT or LGT) is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can complicate investigations of the evolutionary relatedness of lineages and species. Also, because HGT can bring radically different genotypes from distant lineages into genomes, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages.

Computational identification of HGT events relies upon the investigation of the sequence composition or evolutionary history of genes. Sequence composition-based ("parametric") methods search for deviations from the genomic average, whereas evolutionary history-based ("phylogenetic") approaches identify genes whose evolutionary history significantly differs from that of the host species. The evaluation and benchmarking of HGT inference methods typically rely upon simulated genomes, for which the true history is known. On real data, different methods tend to infer different HGT events, and as a result it can be difficult to ascertain all but simple and clear-cut HGT events.


Overview

Horizontal gene transfer was first observed in 1928, in Frederick Griffith's experiment: by showing that virulence was able to pass from virulent to non-virulent strains of ''Streptococcus pneumoniae'', Griffith demonstrated that genetic information can be horizontally transferred between bacteria via a mechanism known as transformation. Similar observations in the 1940s and 1950s showed evidence that conjugation and transduction are additional mechanisms of horizontal gene transfer.

To infer HGT events, which may not necessarily result in phenotypic changes, most contemporary methods are based on analyses of genomic sequence data. These methods can be broadly separated into two groups: parametric and phylogenetic methods. Parametric methods search for sections of a genome that significantly differ from the genomic average in properties such as GC content or codon usage. Phylogenetic methods examine the evolutionary histories of the genes involved and identify conflicting phylogenies. Phylogenetic methods can be further divided into those that reconstruct and compare phylogenetic trees explicitly, and those that use surrogate measures in place of the phylogenetic trees.

The main feature of parametric methods is that they rely only on the genome under study to infer HGT events that may have occurred on its lineage. This was a considerable advantage in the early days of the sequencing era, when few closely related genomes were available for comparative methods. However, because these methods rely on the uniformity of the host's signature, not accounting for the host's intra-genomic variability will result in overpredictions, flagging native segments as possible HGT events. Similarly, the transferred segments need to exhibit the donor's signature and to be significantly different from the recipient's. Furthermore, genomic segments of foreign origin are subject to the same mutational processes as the rest of the host genome, and so the difference between the two tends to vanish over time, a process referred to as amelioration. This limits the ability of parametric methods to detect ancient HGTs.

Phylogenetic methods benefit from the recent availability of many sequenced genomes. Indeed, as for all comparative methods, phylogenetic methods can integrate information from multiple genomes, and in particular integrate them using a model of evolution. This lends them the ability to better characterize the HGT events they infer, notably by designating the donor species and time of the transfer. However, models have limits and need to be used cautiously. For instance, the conflicting phylogenies can be the result of events not accounted for by the model, such as unrecognized paralogy due to duplication followed by gene losses. Also, many approaches rely on a reference species tree that is supposed to be known, when in many instances it can be difficult to obtain a reliable tree. Finally, the computational cost of reconstructing many gene/species trees can be prohibitive. Phylogenetic methods tend to be applied to genes or protein sequences as basic evolutionary units, which limits their ability to detect HGT in regions outside or across gene boundaries.

Because of their complementary approaches, and often non-overlapping sets of HGT candidates, combining predictions from parametric and phylogenetic methods can yield a more comprehensive set of HGT candidate genes. Indeed, combining different parametric methods has been reported to significantly improve the quality of predictions. Moreover, in the absence of a comprehensive set of true horizontally transferred genes, discrepancies between different methods might be resolved through combining parametric and phylogenetic methods. However, combining inferences from multiple methods also entails a risk of an increased false-positive rate.


Parametric methods

Parametric methods to infer HGT use characteristics of the genome sequence specific to particular species or clades, also called genomic signatures. If a fragment of the genome strongly deviates from the genomic signature, this is a sign of a potential horizontal transfer. For example, because bacterial GC content falls within a wide range, the GC content of a genome segment is a simple genomic signature. Commonly used genomic signatures include nucleotide composition, oligonucleotide frequencies, or structural features of the genome.

To detect HGT using parametric methods, the host's genomic signature needs to be clearly recognizable. However, the host's genome is not always uniform with respect to the genomic signature: for example, GC content of the third codon position is lower close to the replication terminus, and GC content tends to be higher in highly expressed genes. Not accounting for such intra-genomic variability in the host can result in over-predictions, flagging native segments as HGT candidates. Larger sliding windows can account for this variability at the cost of a reduced ability to detect smaller HGT regions.

Just as importantly, horizontally transferred segments need to exhibit the donor's genomic signature. This might not be the case for ancient transfers, where transferred sequences are subjected to the same mutational processes as the rest of the host genome, potentially causing their distinct signatures to "ameliorate" and become undetectable through parametric methods. For example, ''Bdellovibrio bacteriovorus'', a predatory δ-Proteobacterium, has homogeneous GC content, and it might be concluded that its genome is resistant to HGT. However, subsequent analysis using phylogenetic methods identified a number of ancient HGT events in the genome of ''B. bacteriovorus''. Similarly, if the inserted segment was previously ameliorated to the host's genome, as is the case for prophage insertions, parametric methods might miss these HGT events. Also, the donor's composition must significantly differ from the recipient's to be identified as abnormal, a condition that might not be met in the case of short- to medium-distance HGT, which are the most prevalent. Furthermore, it has been reported that recently acquired genes tend to be AT-richer than the recipient's average, which indicates that differences in GC-content signature may result from unknown post-acquisition mutational processes rather than from the donor's genome.


Nucleotide composition

Bacterial GC content falls within a wide range, with ''Ca. Zinderia insecticola'' having a GC content of 13.5% and ''Anaeromyxobacter dehalogenans'' having a GC content of 75%. Even within a closely related group of α-Proteobacteria, values range from approximately 30% to 65%. These differences can be exploited when detecting HGT events, as a significantly different GC content for a genome segment can be an indication of foreign origin.
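As a rough illustration of how a GC-content scan might look, the sketch below slides a window along a sequence and flags windows whose GC content lies far from the mean over all windows. The function names, the window and step sizes, and the z-score cutoff are assumptions chosen for the example, not values taken from any published method, and the test genome is simulated.

```python
import random
from statistics import mean, pstdev

def gc_content(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def atypical_gc_windows(genome: str, window: int = 5000, step: int = 500,
                        z_cutoff: float = 2.5):
    """Return (start, gc, z) for windows whose GC content deviates from the
    mean over all windows by more than z_cutoff standard deviations.
    The cutoff is an illustrative assumption."""
    starts = list(range(0, len(genome) - window + 1, step))
    gcs = [gc_content(genome[s:s + window]) for s in starts]
    if not gcs:
        return []
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []
    return [(s, gc, (gc - mu) / sigma)
            for s, gc in zip(starts, gcs)
            if abs(gc - mu) / sigma > z_cutoff]

def random_dna(n: int, gc: float, rng: random.Random) -> str:
    """Simulated sequence of length n with expected GC fraction gc."""
    weights = [gc / 2, gc / 2, (1 - gc) / 2, (1 - gc) / 2]
    return "".join(rng.choices("GCAT", weights=weights, k=n))

# Simulated host (35% GC) carrying a 10 kb GC-rich insert (70% GC).
rng = random.Random(0)
genome = (random_dna(50_000, 0.35, rng)
          + random_dna(10_000, 0.70, rng)
          + random_dna(50_000, 0.35, rng))
hits = atypical_gc_windows(genome)
```

Because the mean and standard deviation are computed over the whole sequence, a very large insert would shift the baseline itself. A real scan would also need to account for the intra-genomic variability of the host.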


Oligonucleotide spectrum

The oligonucleotide spectrum (or k-mer frequencies) measures the frequency of all possible nucleotide sequences of a particular length in the genome. It tends to vary less within genomes than between genomes and therefore can also be used as a genomic signature. A deviation from this signature suggests that a genomic segment might have arrived through horizontal transfer. The oligonucleotide spectrum owes much of its discriminatory power to the number of possible oligonucleotides: if n is the size of the vocabulary and w is the oligonucleotide size, the number of possible distinct oligonucleotides is n^w; for example, there are 4^5 = 1024 possible pentanucleotides. Some methods can capture the signal recorded in motifs of variable size, thus capturing both rare and discriminative motifs along with frequent, but more common ones.
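A minimal sketch of an oligonucleotide spectrum, assuming a four-letter DNA alphabet and overlapping k-mer counts. The function names and the choice of Manhattan distance to compare two spectra are illustrative assumptions, not a specific published measure.

```python
from collections import Counter
from itertools import product

def kmer_spectrum(seq: str, k: int) -> dict:
    """Relative frequency of every possible k-mer over ACGT.
    k-mers containing other symbols (e.g. N) are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

def spectrum_distance(a: dict, b: dict) -> float:
    """Manhattan distance between two spectra built with the same k."""
    return sum(abs(a[m] - b[m]) for m in a)

# The spectrum always has n^w entries: 4^5 = 1024 pentanucleotides.
n_pentanucleotides = len(kmer_spectrum("ACGT", 5))
```

A segment could then be scored by the distance between its spectrum and the spectrum of the whole genome, with larger distances suggesting an atypical composition.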
Codon usage bias, a measure related to codon frequencies, was one of the first detection methods used in methodical assessments of HGT. This approach requires a host genome with a bias towards certain synonymous codons (different codons which code for the same amino acid) that is clearly distinct from the bias found within the donor genome. The simplest oligonucleotide used as a genomic signature is the dinucleotide; for example, the third nucleotide in a codon and the first nucleotide in the following codon represent the dinucleotide least restricted by amino acid preference and codon usage. It is important to optimise the size of the sliding window in which to count the oligonucleotide frequency: a larger sliding window will better buffer variability in the host genome at the cost of being worse at detecting smaller HGT regions. A good compromise has been reported using tetranucleotide frequencies in a sliding window of 5 kb with a step of 0.5 kb. A convenient method of modelling oligonucleotide genomic signatures is to use Markov chains. The transition probability matrix can be derived for endogenous vs. acquired genes, from which Bayesian posterior probabilities for particular stretches of DNA can be obtained.
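The Markov-chain idea can be sketched as follows, under several assumptions not stated in the text: a first-order chain, add-one pseudocounts, a fixed prior probability that a stretch is acquired, and toy training sequences. All names are hypothetical.

```python
import math

ALPHABET = "ACGT"

def transition_log_probs(seqs, pseudocount=1.0):
    """First-order transition log-probabilities estimated from training
    sequences; the pseudocount avoids zero probabilities."""
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        s = s.upper()
        for a, b in zip(s, s[1:]):
            if a in counts and b in counts[a]:
                counts[a][b] += 1
    return {a: {b: math.log(c / sum(row.values())) for b, c in row.items()}
            for a, row in counts.items()}

def log_likelihood(seq, logp):
    """Log-likelihood of the transitions in seq under one chain."""
    seq = seq.upper()
    return sum(logp[a][b] for a, b in zip(seq, seq[1:])
               if a in logp and b in logp[a])

def posterior_acquired(seq, native_logp, acquired_logp, prior_acquired=0.1):
    """P(acquired | seq) by Bayes' rule from the two chain likelihoods.
    The prior is an illustrative assumption."""
    ln = log_likelihood(seq, native_logp) + math.log(1 - prior_acquired)
    la = log_likelihood(seq, acquired_logp) + math.log(prior_acquired)
    m = max(ln, la)  # subtract the maximum for numerical stability
    return math.exp(la - m) / (math.exp(ln - m) + math.exp(la - m))

# Toy training sets standing in for endogenous and acquired genes.
native = transition_log_probs(["AT" * 200])
acquired = transition_log_probs(["GC" * 200])
p_foreign = posterior_acquired("GCGCGCGCGC", native, acquired)
p_native = posterior_acquired("ATATATATAT", native, acquired)
```

In practice the two transition matrices would be trained on curated sets of genes, and the result depends on how well those sets represent native and foreign sequence.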


Structural features

Just as the nucleotide composition of a DNA molecule can be represented by a sequence of letters, its structural features can be encoded in a numerical sequence. The structural features include interaction energies between neighbouring base pairs, the angle of twist that makes two bases of a pair non-coplanar, or DNA deformability induced by the proteins shaping the chromatin. The autocorrelation analysis of some of these numerical sequences shows characteristic periodicities in complete genomes. In fact, after detecting archaea-like regions in the thermophilic bacterium ''Thermotoga maritima'', periodicity spectra of these regions were compared to the periodicity spectra of the homologous regions in the archaeon ''Pyrococcus horikoshii''. The revealed similarities in the periodicity were strong supporting evidence for a case of massive HGT between the bacteria and the archaea kingdoms.


Genomic context

The existence of genomic islands, short (typically 10–200 kb long) regions of a genome which have been acquired horizontally, lends support to the ability to identify non-native genes by their location in a genome. For example, a gene of ambiguous origin which forms part of a non-native operon could be considered to be non-native. Alternatively, flanking repeat sequences or the presence of nearby integrases or transposases can indicate a non-native region. A machine-learning approach combining oligonucleotide frequency scans with context information was reported to be effective at identifying genomic islands. In another study, the context was used as a secondary indicator, after removal of genes which are strongly thought to be native or non-native through the use of other parametric methods.


Phylogenetic methods

The use of phylogenetic analysis in the detection of HGT was advanced by the availability of many newly sequenced genomes. Phylogenetic methods detect inconsistencies in gene and species evolutionary history in two ways: explicitly, by reconstructing the gene tree and reconciling it with the reference species tree, or implicitly, by examining aspects that correlate with the evolutionary history of the genes in question, e.g., patterns of presence/absence across species, or unexpectedly short or distant pairwise evolutionary distances.


Explicit phylogenetic methods

The aim of explicit phylogenetic methods is to compare gene trees with their associated species trees. While weakly supported differences between gene and species trees can be due to inference uncertainty, statistically significant differences can be suggestive of HGT events. For example, if two genes from different species share the most recent ancestral connecting node in the gene tree, but the respective species are spaced apart in the species tree, an HGT event can be invoked. Such an approach can produce more detailed results than parametric approaches because the involved species, time and direction of transfer can potentially be identified.

As discussed in more detail below, phylogenetic methods range from simple methods merely identifying discordance between gene and species trees to mechanistic models inferring probable sequences of HGT events. An intermediate strategy entails deconstructing the gene tree into smaller parts until each matches the species tree (genome spectral approaches).

Explicit phylogenetic methods rely upon the accuracy of the input rooted gene and species trees, yet these can be challenging to build. Even when there is no doubt in the input trees, the conflicting phylogenies can be the result of evolutionary processes other than HGT, such as duplications and losses, causing these methods to erroneously infer HGT events when paralogy is the correct explanation. Similarly, in the presence of incomplete lineage sorting, explicit phylogenetic methods can erroneously infer HGT events. That is why some explicit model-based methods test multiple evolutionary scenarios involving different kinds of events, and compare their fit to the data using parsimony or probabilistic criteria.


Tests of topologies

To detect sets of genes that fit poorly to the reference tree, one can use statistical tests of topology, such as the Kishino–Hasegawa (KH), Shimodaira–Hasegawa (SH), and Approximately Unbiased (AU) tests. These tests assess the likelihood of the gene sequence alignment when the reference topology is given as the null hypothesis. The rejection of the reference topology is an indication that the evolutionary history for that gene family is inconsistent with the reference tree. When these inconsistencies cannot be explained using a small number of non-horizontal events, such as gene loss and duplication, an HGT event is inferred.

One such analysis checked for HGT in groups of homologs of the γ-Proteobacterial lineage. Six reference trees were reconstructed using either the highly conserved small subunit ribosomal RNA sequences, a consensus of the available gene trees or concatenated alignments of orthologs. The failure to reject the six evaluated topologies, and the rejection of seven alternative topologies, was interpreted as evidence for a small number of HGT events in the selected groups.

Tests of topology identify differences in tree topology, taking into account the uncertainty in tree inference, but they make no attempt at inferring ''how'' the differences came about. To infer the specifics of particular events, genome spectral or subtree pruning and regraft methods are required.


Genome spectral approaches

In order to identify the location of HGT events, genome spectral approaches decompose a gene tree into substructures (such as bipartitions or quartets) and identify those that are consistent or inconsistent with the species tree.


Bipartitions

Removing one edge from a reference tree produces two unconnected sub-trees, each a disjoint set of nodes: a bipartition. If a bipartition is present in both the gene and the species trees, it is compatible; otherwise, it is conflicting. These conflicts can indicate an HGT event or may be the result of uncertainty in gene tree inference. To reduce uncertainty, bipartition analyses typically focus on strongly supported bipartitions, such as those associated with branches with bootstrap values or posterior probabilities above certain thresholds. Any gene family found to have one or several conflicting, but strongly supported, bipartitions is considered an HGT candidate.


Quartet decomposition

Quartets are trees consisting of four leaves. In bifurcating (fully resolved) trees, each internal branch induces a quartet whose leaves are either subtrees of the original tree or actual leaves of the original tree. If the topology of a quartet extracted from the reference species tree is embedded in the gene tree, the quartet is compatible with the gene tree. Conversely, incompatible strongly supported quartets indicate potential HGT events. Quartet mapping methods are much more computationally efficient and naturally handle heterogeneous representation of taxa among gene families, making them a good basis for developing large-scale scans for HGT, looking for highways of gene sharing in databases of hundreds of complete genomes.
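The bipartition and quartet comparisons described above can be sketched on toy trees. This example assumes trees are written as nested tuples of unique leaf names and that the gene and species trees share the same leaves. Branch support is not modelled, and all function names are hypothetical.

```python
def leaf_set(tree):
    """All leaf names below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def bipartitions(tree):
    """Non-trivial bipartitions induced by the internal branches of a tree.
    Each bipartition is an unordered pair of leaf sets, so the result does
    not depend on where the tree is rooted."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - leaves
        if len(leaves) > 1 and len(other) > 1:
            splits.add(frozenset([leaves, other]))
        return leaves

    visit(tree)
    return splits

def conflicting_bipartitions(gene_tree, species_tree):
    """Gene-tree bipartitions that are absent from the species tree."""
    return bipartitions(gene_tree) - bipartitions(species_tree)

def quartet_topology(tree, quartet):
    """Pairing of four leaves displayed by the tree, or None if unresolved."""
    q = frozenset(quartet)
    for split in bipartitions(tree):
        side = next(iter(split))
        inside = q & side
        if len(inside) == 2:
            return frozenset([inside, q - inside])
    return None

species_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "D"), "C"), ("B", "E"))  # A groups with D instead of B
conflicts = conflicting_bipartitions(gene_tree, species_tree)
same_quartet = (quartet_topology(gene_tree, ["A", "B", "D", "E"])
                == quartet_topology(species_tree, ["A", "B", "D", "E"]))
```

A realistic analysis would keep only bipartitions or quartets whose support exceeds a chosen threshold before counting conflicts.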


Subtree pruning and regrafting

A mechanistic way of modelling an HGT event on the reference tree is to first cut an internal branch (i.e., prune the tree) and then regraft it onto another edge, an operation referred to as subtree pruning and regrafting (SPR). If the gene tree was topologically consistent with the original reference tree, the editing results in an inconsistency. Similarly, when the original gene tree is inconsistent with the reference tree, it is possible to obtain a consistent topology by a series of one or more prune and regraft operations applied to the reference tree. By interpreting the edit path of pruning and regrafting, HGT candidate nodes can be flagged and the host and donor genomes inferred. To avoid reporting false positive HGT events due to uncertain gene tree topologies, the optimal "path" of SPR operations can be chosen among multiple possible combinations by considering the branch support in the gene tree. Weakly supported gene tree edges can be ignored a priori, or the support can be used to compute an optimality criterion.

Because conversion of one tree to another by a minimum number of SPR operations is NP-hard, solving the problem becomes considerably more difficult as more nodes are considered. The computational challenge lies in finding the optimal edit path, i.e., the one that requires the fewest steps, and different strategies are used in solving the problem. For example, the HorizStory algorithm reduces the problem by first eliminating the consistent nodes; recursive pruning and regrafting reconciles the reference tree with the gene tree, and optimal edits are interpreted as HGT events. The SPR methods included in the supertree reconstruction package SPRSupertrees substantially decrease the time of the search for the optimal set of SPR operations by considering multiple localised sub-problems in large trees through a clustering approach. The T-REX web server includes a number of HGT detection methods (mostly SPR-based) and allows users to calculate the bootstrap support of the inferred transfers.
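A single SPR move can be sketched on a toy tree as below. The example assumes rooted binary trees written as nested tuples with unique leaf names, and a regraft target that lies outside the pruned subtree. The function names are hypothetical, and the sketch applies one move only: it does not search for the shortest edit path, which is the hard part of the problem.

```python
def prune(tree, subtree):
    """Remove `subtree` from a nested-tuple tree and suppress the unary node
    left behind. Returns None if the whole tree is pruned."""
    if tree == subtree:
        return None
    if isinstance(tree, str):
        return tree
    kept = [prune(child, subtree) for child in tree]
    kept = [child for child in kept if child is not None]
    return kept[0] if len(kept) == 1 else tuple(kept)

def regraft(tree, subtree, target):
    """Attach `subtree` on the branch leading to `target`, making the two
    sister groups."""
    if tree == target:
        return (tree, subtree)
    if isinstance(tree, str):
        return tree
    return tuple(regraft(child, subtree, target) for child in tree)

def spr(tree, subtree, target):
    """One subtree-prune-and-regraft move."""
    return regraft(prune(tree, subtree), subtree, target)

reference = ((("A", "B"), "C"), ("D", "E"))
# Model a transfer that places the gene of B next to D.
moved_leaf = spr(reference, "B", "D")
# Model a move of the whole (D, E) clade onto the branch leading to A.
moved_clade = spr(reference, ("D", "E"), "A")
```

Interpreting the pruned subtree as the recipient lineage and the regraft branch as the donor is how an edit path of this kind is read as a transfer scenario.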


Model-based reconciliation methods

Reconciliation of gene and species trees entails mapping evolutionary events onto gene trees in a way that makes them concordant with the species tree. Different reconciliation models exist, differing in the types of event they consider to explain the incongruences between gene and species tree topologies. Early methods exclusively modelled horizontal transfers (T). More recent ones also account for duplication (D), loss (L), incomplete lineage sorting (ILS) or homologous recombination (HR) events. The difficulty is that by allowing for multiple types of events, the number of possible reconciliations increases rapidly. For instance, a conflicting gene tree topology might be explained by a single HGT event or by multiple duplication and loss events. Both alternatives can be considered plausible reconciliations, depending on the frequency of the respective events along the species tree. Reconciliation methods can rely on a parsimonious or a probabilistic framework to infer the most likely scenario(s), where the relative cost or probability of D, T and L events can be fixed a priori or estimated from the data. The space of DTL reconciliations and their parsimony costs, which can be extremely vast for large multi-copy gene family trees, can be efficiently explored through dynamic programming algorithms. In some programs, the gene tree topology can be refined where it was uncertain, so that it fits a better evolutionary scenario while remaining compatible with the initial sequence alignment. More refined models account for the biased frequency of HGT between closely related lineages (reflecting the loss of efficiency of HR with phylogenetic distance), for ILS, or for the fact that the actual donors of most HGT events belong to extinct or unsampled lineages. Further extensions of DTL models are being developed towards an integrated description of genome evolution processes. In particular, some of them consider horizontal transfer at multiple scales, modelling independent evolution of gene fragments or recognising co-evolution of several genes (e.g., due to co-transfer) within and across genomes.
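
The dependence of the preferred reconciliation on event costs can be sketched in a few lines of Python. The example below scores two hand-written candidate scenarios under a parsimony criterion; the `Scenario` class, the function names and all cost values are hypothetical and chosen only for illustration. Real DTL programs enumerate reconciliations by dynamic programming rather than comparing a fixed list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    """Event counts of one candidate reconciliation (illustrative)."""
    name: str
    duplications: int
    transfers: int
    losses: int

def parsimony_cost(s, d_cost, t_cost, l_cost):
    """Weighted sum of duplication, transfer and loss events."""
    return (s.duplications * d_cost
            + s.transfers * t_cost
            + s.losses * l_cost)

def best_scenario(scenarios, d_cost, t_cost, l_cost):
    """Return the candidate with the lowest parsimony cost."""
    return min(scenarios,
               key=lambda s: parsimony_cost(s, d_cost, t_cost, l_cost))

# Two hypothetical explanations of the same conflicting gene tree.
candidates = [
    Scenario("single transfer", duplications=0, transfers=1, losses=0),
    Scenario("duplication and losses", duplications=1, transfers=0, losses=3),
]

# With cheap transfers the HGT explanation wins (cost 3 versus 5).
print(best_scenario(candidates, d_cost=2, t_cost=3, l_cost=1).name)
# With expensive transfers the duplication-loss explanation wins (6 versus 5).
print(best_scenario(candidates, d_cost=2, t_cost=6, l_cost=1).name)
```

The same conflict is thus attributed to HGT or to duplication and loss depending purely on the assumed relative costs, which is why these are either fixed with care or estimated from the data.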


Implicit phylogenetic methods

In contrast to explicit phylogenetic methods, which compare the agreement between gene and species trees, implicit phylogenetic methods compare evolutionary distances or sequence similarity. Here, an unexpectedly short or long distance from a given reference compared to the average can be suggestive of an HGT event. Because tree construction is not required, implicit approaches tend to be simpler and faster than explicit methods. However, implicit methods can be limited by disparities between the underlying correct phylogeny and the evolutionary distances considered. For instance, the most similar sequence, as given by the highest-scoring BLAST hit, is not always the evolutionarily closest one.


Top sequence match in a distant species

A simple way of identifying HGT events is by looking for high-scoring sequence matches in distantly related species. For example, an analysis of the top BLAST hits of protein sequences in the bacterium ''Thermotoga maritima'' revealed that most hits were in archaea rather than closely related bacteria, suggesting extensive HGT between the two; these predictions were later supported by an analysis of the structural features of the DNA molecule. However, this method is limited to detecting relatively recent HGT events. Indeed, if the HGT occurred in the common ancestor of two or more species included in the database, the closest hit will reside within that clade and the HGT will therefore not be detected by the method. Thus, the minimum number of foreign top BLAST hits required to conclude that a gene was transferred depends strongly on the taxonomic coverage of sequence databases. Experimental settings may therefore need to be defined in an ad hoc way.
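
A minimal Python sketch of the top-hit idea follows. It assumes the search results have already been parsed into `(query, subject_group, bitscore)` tuples with self-hits removed; that tuple layout, the function names and the toy scores are assumptions for illustration and do not correspond to the output format of any particular BLAST program.

```python
def top_hits(hits):
    """Keep the highest-scoring hit per query gene.

    `hits` is an iterable of (query, subject_group, bitscore) tuples,
    a hypothetical pre-parsed form of similarity search results.
    """
    best = {}
    for query, subject_group, bitscore in hits:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject_group, bitscore)
    return best

def foreign_candidates(hits, native_group):
    """Genes whose best match lies outside the genome's own group."""
    return sorted(
        query
        for query, (group, _score) in top_hits(hits).items()
        if group != native_group
    )

# Toy data for a bacterial genome (invented scores).
hits = [
    ("g1", "Bacteria", 250.0),
    ("g1", "Archaea", 180.0),
    ("g2", "Archaea", 310.0),
    ("g2", "Bacteria", 120.0),
    ("g3", "Bacteria", 90.0),
]

print(foreign_candidates(hits, native_group="Bacteria"))  # ['g2']
```

As the text notes, such a flag only says that the best database match is foreign; whether that reflects HGT depends on which relatives are represented in the database.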


Discrepancy between gene and species distances

The molecular clock hypothesis posits that homologous genes evolve at an approximately constant rate across different species. If one only considers homologous genes related through speciation events (referred to as "orthologous" genes), their underlying tree should by definition correspond to the species tree. Therefore, assuming a molecular clock, the evolutionary distances between orthologous genes should be approximately proportional to the evolutionary distances between their respective species. If a putative group of orthologs contains xenologs (pairs of genes related through an HGT event), the proportionality of evolutionary distances may hold only among the orthologs, not the xenologs. Simple approaches compare the distribution of similarity scores of particular sequences and their orthologous counterparts in other species; HGT events are inferred from outliers. The more sophisticated DLIGHT ('Distance Likelihood-based Inference of Genes Horizontally Transferred') method simultaneously considers the effect of HGT on all sequences within groups of putative orthologs: if a likelihood-ratio test of the HGT hypothesis versus a hypothesis of no HGT is significant, a putative HGT event is inferred. In addition, the method allows inference of potential donor and recipient species and provides an estimate of the time since the HGT event.
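
The "simple approach" based on outliers can be sketched as follows. The code computes the ratio of gene distance to species distance for each pair of genomes and flags pairs whose ratio deviates strongly from the median, using the median absolute deviation as a robust scale. The choice of statistic, the cutoff of 3.0 and the toy distances are assumptions made for this sketch; this is not the DLIGHT likelihood-ratio test.

```python
from statistics import median

def flag_distance_outliers(gene_dist, species_dist, cutoff=3.0):
    """Flag pairs whose gene/species distance ratio is atypical.

    Both arguments map a pair of genome names to a distance.
    Under a molecular clock the ratios of true orthologs should be
    roughly constant, so strong deviations hint at xenologs.
    """
    ratios = {pair: gene_dist[pair] / species_dist[pair]
              for pair in gene_dist}
    centre = median(ratios.values())
    # Median absolute deviation: a scale estimate robust to outliers.
    mad = median(abs(r - centre) for r in ratios.values())
    if mad == 0:
        return []
    return sorted(pair for pair, r in ratios.items()
                  if abs(r - centre) / mad > cutoff)

# Invented distances for four genomes; the A-D gene distance is far
# shorter than the species distance would predict.
species_dist = {
    ("A", "B"): 0.1, ("A", "C"): 0.2, ("A", "D"): 0.4,
    ("B", "C"): 0.2, ("B", "D"): 0.4, ("C", "D"): 0.4,
}
gene_dist = {
    ("A", "B"): 0.21, ("A", "C"): 0.40, ("A", "D"): 0.10,
    ("B", "C"): 0.41, ("B", "D"): 0.80, ("C", "D"): 0.79,
}

print(flag_distance_outliers(gene_dist, species_dist))  # [('A', 'D')]
```

An unexpectedly short distance such as the A-D pair here is compatible with a transfer between those two lineages, but rate variation among genes can produce similar outliers, which is one motivation for model-based tests.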


Phylogenetic profiles

A group of orthologous or homologous genes can be analysed in terms of the presence or absence of group members in the reference genomes; such patterns are called phylogenetic profiles. To find HGT events, phylogenetic profiles are scanned for an unusual distribution of genes. Absence of a homolog in some members of a group of closely related species is an indication that the examined gene might have arrived via an HGT event. For example, the genomes of three facultatively symbiotic ''Frankia'' sp. strains are of strikingly different sizes (5.43 Mbp, 7.50 Mbp and 9.04 Mbp), depending on their range of hosts. Marked portions of strain-specific genes were found to have no significant hit in the reference database, and were possibly acquired by HGT from other bacteria. Similarly, three phenotypically diverse ''Escherichia coli'' strains (uropathogenic, enterohemorrhagic and benign) share about 40% of the total combined gene pool, with the other 60% being strain-specific genes and consequently HGT candidates. Further evidence that these genes resulted from HGT was their strikingly different codon usage patterns compared with the core genes and a lack of gene order conservation (order conservation is typical of vertically evolved genes). The presence or absence of homologs (or their effective count) can thus be used by programs to reconstruct the most likely evolutionary scenario along the species tree. Just as with reconciliation methods, this can be achieved through parsimonious or probabilistic estimation of the number of gain and loss events. Models can be made more complex by adding processes, such as the truncation of genes, and by modelling the heterogeneity of rates of gain and loss across lineages and/or gene families.
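
One simple way to turn a phylogenetic profile into a count of gain and loss events is Fitch parsimony on a presence/absence character, sketched below. This is a deliberately basic stand-in that weights gains and losses equally; the nested-tuple species tree, the function name `min_changes` and the example profiles are assumptions for illustration, and the programs alluded to in the text may use weighted or probabilistic models instead.

```python
def min_changes(tree, present):
    """Fitch parsimony for one presence/absence profile.

    `tree` is a rooted binary tree of nested 2-tuples with string
    leaves; `present` is the set of genomes that contain the gene.
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {tree in present}, 0
    left_states, left_cost = min_changes(tree[0], present)
    right_states, right_cost = min_changes(tree[1], present)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no change needed here.
        return shared, left_cost + right_cost
    # Children disagree: one gain or loss must be placed on a branch.
    return left_states | right_states, left_cost + right_cost + 1

# Hypothetical species tree and gene profiles.
species_tree = ((("A", "B"), "C"), ("D", "E"))

print(min_changes(species_tree, {"A", "B", "C", "D", "E"})[1])  # 0
print(min_changes(species_tree, {"B"})[1])                      # 1
print(min_changes(species_tree, {"B", "D"})[1])                 # 2
```

A gene present only in B and D needs at least two events, for example two independent gains; whether such a patchy profile is better read as transfers or as an ancestral gene with several losses depends on the relative weights given to gain and loss.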


Clusters of polymorphic sites

Genes are commonly regarded as the basic units transferred through an HGT event. However, it is also possible for HGT to occur within genes. For example, it has been shown that horizontal transfer between closely related species results in more exchange of ORF fragments, a type of transfer called gene conversion, mediated by homologous recombination. The analysis of a group of four ''Escherichia coli'' and two ''Shigella flexneri'' strains revealed that the sequence stretches common to all six strains contain polymorphic sites, consequences of homologous recombination. Clusters with an excess of polymorphic sites can thus be used to detect tracts of DNA recombined with a distant relative. This method of detection is, however, restricted to the sites common to all analysed sequences, limiting the analysis to a group of closely related organisms.
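
The idea of scanning for clusters of polymorphic sites can be sketched with a sliding window over an alignment, as below. The window length, the threshold `min_sites` and the toy sequences are arbitrary assumptions for illustration; published analyses would typically assess the excess statistically rather than with a fixed cutoff.

```python
def polymorphic_sites(alignment):
    """Indices of alignment columns with more than one character."""
    return [i for i, column in enumerate(zip(*alignment))
            if len(set(column)) > 1]

def dense_windows(alignment, window=10, min_sites=3):
    """Return (start, count) for windows rich in polymorphic sites.

    `alignment` is a list of equal-length, gap-free strings covering
    only the positions shared by all sequences.
    """
    sites = set(polymorphic_sites(alignment))
    length = len(alignment[0])
    hits = []
    for start in range(length - window + 1):
        count = sum(1 for i in range(start, start + window) if i in sites)
        if count >= min_sites:
            hits.append((start, count))
    return hits

# Toy alignment: the second sequence carries a short divergent tract.
alignment = [
    "A" * 20,
    "A" * 12 + "CGT" + "A" * 5,
]

print(polymorphic_sites(alignment))                     # [12, 13, 14]
print(dense_windows(alignment, window=5, min_sites=3))  # [(10, 3), (11, 3), (12, 3)]
```

Overlapping windows that pass the threshold would normally be merged into a single candidate tract before interpretation.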


Evaluation

The existence of numerous and varied methods to infer HGT raises the question of how to validate individual inferences and how to compare the different methods. A main problem is that, as with other types of phylogenetic inference, the actual evolutionary history cannot be established with certainty. As a result, it is difficult to obtain a representative test set of HGT events. Furthermore, HGT inference methods vary considerably in the information they consider and often identify inconsistent groups of HGT candidates: it is not clear to what extent taking the intersection, the union, or some other combination of the individual methods affects the false positive and false negative rates.

Parametric and phylogenetic methods draw on different sources of information; it is therefore difficult to make general statements about their relative performance. Conceptual arguments can, however, be invoked. While parametric methods are limited to the analysis of single genomes or pairs of genomes, phylogenetic methods provide a natural framework to take advantage of the information contained in multiple genomes. In many cases, segments of genomes inferred as HGT based on their anomalous composition can also be recognised as such on the basis of phylogenetic analyses or through their mere absence in genomes of related organisms. In addition, phylogenetic methods rely on explicit models of sequence evolution, which provide a well-understood framework for parameter inference, hypothesis testing, and model selection. This is reflected in the literature, which tends to favour phylogenetic methods as the standard of proof for HGT. The use of phylogenetic methods thus appears to be the preferred standard, especially given that the increase in computational power coupled with algorithmic improvements has made them more tractable, and that the ever denser sampling of genomes lends more power to these tests.

Considering phylogenetic methods, several approaches to validating individual HGT inferences and benchmarking methods have been adopted, typically relying on various forms of simulation. Because the truth is known in simulation, the number of false positives and the number of false negatives are straightforward to compute. However, simulating data does not trivially resolve the problem, because the true extent of HGT in nature remains largely unknown and specifying rates of HGT in the simulated model is always hazardous. Nonetheless, studies comparing several phylogenetic methods in a simulation framework could provide a quantitative assessment of their respective performances, and thus help biologists choose appropriate tools objectively. Standard tools to simulate sequence evolution along trees, such as INDELible or PhyloSim, can be adapted to simulate HGT. HGT events cause the relevant gene trees to conflict with the species tree, so such events can be simulated through subtree pruning and regrafting rearrangements of the species tree. However, it is important to simulate data that are realistic enough to be representative of the challenge posed by real datasets, and simulations under complex models are thus preferable. A model was developed to simulate gene trees with heterogeneous substitution processes in addition to the occurrence of transfer, accounting for the fact that transfers can come from now-extinct donor lineages. Alternatively, the genome evolution simulator ALF directly generates gene families subject to HGT, by accounting for a whole range of evolutionary forces at the base level, but in the context of a complete genome. Given simulated sequences that contain HGT, analysing those sequences with the methods of interest and comparing their results with the known truth permits the study of their performance. Similarly, testing the methods on sequences known not to contain HGT enables the study of false positive rates.

Simulation of HGT events can also be performed by manipulating the biological sequences themselves. Artificial chimeric genomes can be obtained by inserting known foreign genes into random positions of a host genome. The donor sequences are inserted into the host unchanged or can be further evolved by simulation, e.g., using the tools described above. One important caveat to simulation as a way to assess different methods is that simulation is based on strong simplifying assumptions, which may favour particular methods.
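
The benchmarking logic described above (plant known foreign genes, run a detection method, compare its calls with the truth) can be sketched in Python as follows. Genes are represented by labels rather than sequences, and the function names, the seeded random placement and the example predictions are all assumptions made for this illustration; the `predicted` set stands in for the output of whatever HGT inference method is being assessed.

```python
import random

def make_chimera(host_genes, foreign_genes, seed=0):
    """Insert foreign genes at random positions of a host gene list."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    genome = list(host_genes)
    for gene in foreign_genes:
        genome.insert(rng.randint(0, len(genome)), gene)
    return genome

def error_rates(predicted, truth, all_genes):
    """False positive and false negative rates against a known truth."""
    predicted, truth, all_genes = set(predicted), set(truth), set(all_genes)
    false_pos = len(predicted - truth)   # native genes called foreign
    false_neg = len(truth - predicted)   # foreign genes that were missed
    negatives = len(all_genes - truth)
    return {
        "false_positive_rate": false_pos / negatives if negatives else 0.0,
        "false_negative_rate": false_neg / len(truth) if truth else 0.0,
    }

# Hypothetical host and donor gene labels.
host = [f"h{i}" for i in range(1, 9)]
foreign = ["x1", "x2"]
chimera = make_chimera(host, foreign, seed=1)

# Placeholder for the calls made by a method under evaluation.
predicted = {"x1", "h3"}

print(error_rates(predicted, foreign, chimera))
# {'false_positive_rate': 0.125, 'false_negative_rate': 0.5}
```

Running the same comparison on a genome with no inserted genes gives the false positive rate alone, mirroring the negative-control test mentioned in the text.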


See also

* Index of evolutionary biology articles
* Horizontal gene transfer
* Horizontal gene transfer in evolution
* Phylogenetic tree
* Phylogenetic network
* Bioinformatics
* Comparative genomics
* Homology (biology)


Goesmann A, Meyer F, Sockett RE, Schuster SC , s2cid = 38154836 , display-authors = 6 , title = A predator unmasked: life cycle of Bdellovibrio bacteriovorus from a genomic perspective , journal = Science , volume = 303 , issue = 5658 , pages = 689–92 , date = January 2004 , pmid = 14752164 , doi = 10.1126/science.1093027 , bibcode = 2004Sci...303..689R {{cite journal , vauthors = Shimodaira H, Hasegawa M , year = 1999 , title = Multiple Comparisons of Log-Likelihoods with Applications to Phylogenetic Inference , journal = Molecular Biology and Evolution , volume = 16 , issue = 8, pages = 1114–1116 , doi=10.1093/oxfordjournals.molbev.a026201, doi-access = free {{cite journal , vauthors = Shimodaira H , s2cid = 11586099 , title = An approximately unbiased test of phylogenetic tree selection , journal = Systematic Biology , volume = 51 , issue = 3 , pages = 492–508 , date = June 2002 , pmid = 12079646 , doi = 10.1080/10635150290069913 {{cite journal , vauthors = Sipos B, Massingham T, Jordan GE, Goldman N , title = PhyloSim - Monte Carlo simulation of sequence evolution in the R statistical computing environment , journal = BMC Bioinformatics , volume = 12 , pages = 104 , date = April 2011 , pmid = 21504561 , pmc = 3102636 , doi = 10.1186/1471-2105-12-104 {{cite journal , vauthors = Vernikos GS, Thomson NR, Parkhill J , title = Genetic flux over time in the Salmonella lineage , journal = Genome Biology , volume = 8 , issue = 6 , pages = R100 , year = 2007 , pmid = 17547764 , pmc = 2394748 , doi = 10.1186/gb-2007-8-6-r100 {{cite journal , vauthors = Welch RA, Burland V, Plunkett G, Redford P, Roesch P, Rasko D, Buckles EL, Liou SR, Boutin A, Hackett J, Stroud D, Mayhew GF, Rose DJ, Zhou S, Schwartz DC, Perna NT, Mobley HL, Donnenberg MS, Blattner FR , display-authors = 6 , title = Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli , journal = Proceedings of the National Academy of Sciences of the United States of 
America , volume = 99 , issue = 26 , pages = 17020–4 , date = December 2002 , pmid = 12471157 , pmc = 139262 , doi = 10.1073/pnas.252529799 , bibcode = 2002PNAS...9917020W , doi-access = free {{cite journal , vauthors = Wuitschick JD, Karrer KM , title = Analysis of genomic G + C content, codon usage, initiator codon context and translation termination sites in Tetrahymena thermophila , journal = The Journal of Eukaryotic Microbiology , volume = 46 , issue = 3 , pages = 239–47 , year = 1999 , pmid = 10377985 , doi = 10.1111/j.1550-7408.1999.tb05120.x , s2cid = 28836138 {{cite journal , vauthors = Worning P, Jensen LJ, Nelson KE, Brunak S, Ussery DW , title = Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima , journal = Nucleic Acids Research , volume = 28 , issue = 3 , pages = 706–9 , date = February 2000 , pmid = 10637321 , pmc = 102551 , doi = 10.1093/nar/28.3.706 {{cite journal , vauthors = Xiong D, Xiao F, Liu L, Hu K, Tan Y, He S, Gao X , title = Towards a better detection of horizontally transferred genes by combining unusual properties effectively , journal = PLOS ONE , volume = 7 , issue = 8 , pages = e43126 , year = 2012 , pmid = 22905214 , pmc = 3419211 , doi = 10.1371/journal.pone.0043126 , bibcode = 2012PLoSO...743126X , doi-access = free {{cite journal , vauthors = Zhaxybayeva O, Hamel L, Raymond J, Gogarten JP , title = Visualization of the phylogenetic content of five genomes using dekapentagonal maps , journal = Genome Biology , volume = 5 , issue = 3 , pages = R20 , year = 2004 , pmid = 15003123 , pmc = 395770 , doi = 10.1186/gb-2004-5-3-r20 {{cite journal , vauthors = Zhaxybayeva O, Gogarten JP, Charlebois RL, Doolittle WF, Papke RT , title = Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events , journal = Genome Research , volume = 16 , issue = 9 , pages = 1099–108 , date = September 2006 , pmid = 16899658 , pmc = 1557764 , doi = 10.1101/gr.5322306 {{cite 
journal , vauthors = Zinder ND, Lederberg J , title = Genetic exchange in Salmonella , journal = Journal of Bacteriology , volume = 64 , issue = 5 , pages = 679–99 , date = November 1952 , pmid = 12999698 , pmc = 169409 , doi = 10.1128/JB.64.5.679-699.1952 , author-link1 = Norton Zinder , author-link2 = Joshua Lederberg Zuckerkandl, E. and Pauling, L.B. 1965. Evolutionary divergence and convergence in proteins. In Bryson, V.and Vogel, H.J. (editors). Evolving Genes and Proteins. Academic Press, New York. pp. 97–166. {{cite journal , vauthors = Vernikos GS, Parkhill J , title = Interpolated variable order motifs for identification of horizontally acquired DNA: revisiting the Salmonella pathogenicity islands , journal = Bioinformatics , volume = 22 , issue = 18 , pages = 2196–203 , date = September 2006 , pmid = 16837528 , doi = 10.1093/bioinformatics/btl369 , doi-access = free {{cite journal , vauthors = Vernikos GS, Parkhill J , title = Resolving the structural features of genomic islands: a machine learning approach , journal = Genome Research , volume = 18 , issue = 2 , pages = 331–42 , date = February 2008 , pmid = 18071028 , pmc = 2203631 , doi = 10.1101/gr.7004508 {{cite journal , vauthors = Wisniewski-Dyé F, Borziak K, Khalsa-Moyers G, Alexandre G, Sukharnikov LO, Wuichet K, Hurst GB, McDonald WH, Robertson JS, Barbe V, Calteau A, Rouy Z, Mangenot S, Prigent-Combaret C, Normand P, Boyer M, Siguier P, Dessaux Y, Elmerich C, Condemine G, Krishnen G, Kennedy I, Paterson AH, González V, Mavingui P, Zhulin IB , display-authors = 6 , title = Azospirillum genomes reveal transition of bacteria from aquatic to terrestrial environments , journal = PLOS Genetics , volume = 7 , issue = 12 , pages = e1002430 , date = December 2011 , pmid = 22216014 , pmc = 3245306 , doi = 10.1371/journal.pgen.1002430 , veditors = Richardson PM {{cite journal , vauthors = David LA, Alm EJ , s2cid = 4420725 , title = Rapid evolutionary innovation during an Archaean genetic expansion , 
journal = Nature , volume = 469 , issue = 7328 , pages = 93–6 , date = January 2011 , pmid = 21170026 , doi = 10.1038/nature09649 , url = https://dspace.mit.edu/bitstream/1721.1/61263/1/Alm.Main.pdf , bibcode = 2011Natur.469...93D , hdl = 1721.1/61263 , hdl-access = free {{cite journal , vauthors = Szöllosi GJ, Boussau B, Abby SS, Tannier E, Daubin V , title = Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations , journal = Proceedings of the National Academy of Sciences of the United States of America , volume = 109 , issue = 43 , pages = 17513–8 , date = October 2012 , pmid = 23043116 , pmc = 3491530 , doi = 10.1073/pnas.1202997109 , bibcode = 2012PNAS..10917513S , doi-access = free {{cite journal , vauthors = Doyon JP, Hamel S, Chauve C , s2cid = 2493991 , title = An efficient method for exploring the space of gene tree/species tree reconciliations in a probabilistic framework , journal = IEEE/ACM Transactions on Computational Biology and Bioinformatics , volume = 9 , issue = 1 , pages = 26–39 , year = 2012 , pmid = 21464510 , doi = 10.1109/TCBB.2011.64 , url = https://hal-lirmm.ccsd.cnrs.fr/lirmm-00448486/file/RR-10002.pdf {{cite journal , vauthors = Nguyen TH, Ranwez V, Pointet S, Chifolleau AM, Doyon JP, Berry V , title = Reconciliation and local gene tree rearrangement can be of mutual profit , journal = Algorithms for Molecular Biology , volume = 8 , issue = 1 , pages = 12 , date = April 2013 , pmid = 23566548 , pmc = 3871789 , doi = 10.1186/1748-7188-8-12 {{cite journal , vauthors = Bansal MS, Alm EJ, Kellis M , title = Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss , journal = Bioinformatics , volume = 28 , issue = 12 , pages = i283-91 , date = June 2012 , pmid = 22689773 , pmc = 3371857 , doi = 10.1093/bioinformatics/bts225 {{cite journal , vauthors = Szöllõsi GJ, Rosikiewicz W, Boussau B, Tannier E, Daubin V , title = Efficient exploration of 
the space of reconciled gene trees , journal = Systematic Biology , volume = 62 , issue = 6 , pages = 901–12 , date = November 2013 , pmid = 23925510 , pmc = 3797637 , doi = 10.1093/sysbio/syt054 , arxiv = 1306.2167 , bibcode = 2013arXiv1306.2167S {{cite journal , vauthors = Majewski J, Zawadzki P, Pickerill P, Cohan FM, Dowson CG , title = Barriers to genetic exchange between bacterial species: Streptococcus pneumoniae transformation , journal = Journal of Bacteriology , volume = 182 , issue = 4 , pages = 1016–23 , date = February 2000 , pmid = 10648528 , pmc = 94378 , doi = 10.1128/jb.182.4.1016-1023.2000 {{cite journal , vauthors = Sjöstrand J, Tofigh A, Daubin V, Arvestad L, Sennblad B, Lagergren J , title = A Bayesian method for analyzing lateral gene transfer , journal = Systematic Biology , volume = 63 , issue = 3 , pages = 409–20 , date = May 2014 , pmid = 24562812 , doi = 10.1093/sysbio/syu007 , doi-access = free {{cite journal , vauthors = Szöllosi GJ, Tannier E, Lartillot N, Daubin V , title = Lateral gene transfer from the dead , journal = Systematic Biology , volume = 62 , issue = 3 , pages = 386–97 , date = May 2013 , pmid = 23355531 , pmc = 3622898 , doi = 10.1093/sysbio/syt003 , arxiv = 1211.4606 {{cite journal , vauthors = Szöllősi GJ, Tannier E, Daubin V, Boussau B , title = The inference of gene trees with species trees , journal = Systematic Biology , volume = 64 , issue = 1 , pages = e42-62 , date = January 2015 , pmid = 25070970 , pmc = 4265139 , doi = 10.1093/sysbio/syu048 {{Cite book , doi = 10.1007/978-3-540-87989-3_6, chapter = Ancestral Reconstruction by Asymmetric Wagner Parsimony over Continuous Characters and Squared Parsimony over Distributions, title = Comparative Genomics, volume = 5267, pages = 72–86, series = Lecture Notes in Computer Science, year = 2008, vauthors = Csűrös MS , isbn = 978-3-540-87988-6 {{cite journal , vauthors = Csurös M, Miklós I , title = Streamlining and large ancestral genomes in Archaea inferred with a 
phylogenetic birth-and-death model , journal = Molecular Biology and Evolution , volume = 26 , issue = 9 , pages = 2087–95 , date = September 2009 , pmid = 19570746 , pmc = 2726834 , doi = 10.1093/molbev/msp123 {{cite journal , vauthors = Pagel M , s2cid = 205034365 , title = Inferring the historical patterns of biological evolution , journal = Nature , volume = 401 , issue = 6756 , pages = 877–84 , date = October 1999 , pmid = 10553904 , doi = 10.1038/44766 , bibcode = 1999Natur.401..877P , hdl = 2027.42/148253 , hdl-access = free {{cite journal , vauthors = Didelot X, Falush D , title = Inference of bacterial microevolution using multilocus sequence data , journal = Genetics , volume = 175 , issue = 3 , pages = 1251–66 , date = March 2007 , pmid = 17151252 , pmc = 1840087 , doi = 10.1534/genetics.106.063305 {{cite journal , vauthors = Hiramatsu K, Cui L, Kuroda M, Ito T , title = The emergence and evolution of methicillin-resistant ''Staphylococcus aureus'' , journal = Trends in Microbiology , volume = 9 , issue = 10 , pages = 486–93 , date = October 2001 , pmid = 11597450 , doi = 10.1016/s0966-842x(01)02175-8 {{cite journal , vauthors = McCutcheon JP, Moran NA , title = Functional convergence in reduced genomes of bacterial symbionts spanning 200 My of evolution , journal = Genome Biology and Evolution , volume = 2 , pages = 708–18 , year = 2010 , pmid = 20829280 , pmc = 2953269 , doi = 10.1093/gbe/evq055 {{cite journal , vauthors = Ragan MA, Harlow TJ, Beiko RG , title = Do different surrogate methods detect lateral genetic transfer events of different relative ages? 
, journal = Trends in Microbiology , volume = 14 , issue = 1 , pages = 4–8 , date = January 2006 , pmid = 16356716 , doi = 10.1016/j.tim.2005.11.004 {{cite journal , vauthors = Galtier N , title = A model of horizontal gene transfer and the bacterial phylogeny problem , journal = Systematic Biology , volume = 56 , issue = 4 , pages = 633–42 , date = August 2007 , pmid = 17661231 , doi = 10.1080/10635150701546231 , doi-access = free {{cite journal , vauthors = Than C, Ruths D, Innan H, Nakhleh L , title = Confounding factors in HGT detection: statistical error, coalescent effects, and multiple solutions , journal = Journal of Computational Biology , volume = 14 , issue = 4 , pages = 517–35 , date = May 2007 , pmid = 17572027 , doi = 10.1089/cmb.2007.A010 , citeseerx = 10.1.1.121.7834 {{cite journal , vauthors = Whidden C, Zeh N, Beiko RG , title = Supertrees Based on the Subtree Prune-and-Regraft Distance , journal = Systematic Biology , volume = 63 , issue = 4 , pages = 566–81 , date = July 2014 , pmid = 24695589 , pmc = 4055872 , doi = 10.1093/sysbio/syu023 {{cite journal , vauthors = Hao W, Golding GB , title = The fate of laterally transferred genes: life in the fast lane to adaptation or death , journal = Genome Research , volume = 16 , issue = 5 , pages = 636–43 , date = May 2006 , pmid = 16651664 , pmc = 1457040 , doi = 10.1101/gr.4746406 {{cite journal , vauthors = Hao W, Golding GB , title = Uncovering rate variation of lateral gene transfer during bacterial genome evolution , journal = BMC Genomics , volume = 9 , pages = 235 , date = May 2008 , pmid = 18492275 , pmc = 2426709 , doi = 10.1186/1471-2164-9-235 {{cite journal , vauthors = Hao W, Golding GB , title = Inferring bacterial genome flux while considering truncated genes , journal = Genetics , volume = 186 , issue = 1 , pages = 411–26 , date = September 2010 , pmid = 20551435 , pmc = 2940306 , doi = 10.1534/genetics.110.118448 {{cite journal , vauthors = Haggerty LS, Jachiet PA, Hanage WP, 
Fitzpatrick DA, Lopez P, O'Connell MJ, Pisani D, Wilkinson M, Bapteste E, McInerney JO , display-authors = 6 , title = A pluralistic account of homology: adapting the models to the data , journal = Molecular Biology and Evolution , volume = 31 , issue = 3 , pages = 501–16 , date = March 2014 , pmid = 24273322 , pmc = 3935183 , doi = 10.1093/molbev/mst228 {{cite journal , vauthors = Bansal MS, Banay G, Gogarten JP, Shamir R , title = Detecting highways of horizontal gene transfer , journal = Journal of Computational Biology , volume = 18 , issue = 9 , pages = 1087–114 , date = September 2011 , pmid = 21899418 , doi = 10.1089/cmb.2011.0066 , citeseerx = 10.1.1.418.3658 {{cite journal , vauthors = Bansal MS, Banay G, Harlow TJ, Gogarten JP, Shamir R , title = Systematic inference of highways of horizontal gene transfer in prokaryotes , journal = Bioinformatics , volume = 29 , issue = 5 , pages = 571–9 , date = March 2013 , pmid = 23335015 , doi = 10.1093/bioinformatics/btt021 , doi-access = free Nakhleh L, Ruths DA, Wang L: RIATA-HGT: A Fast and Accurate Heuristic for Reconstructing Horizontal Gene Transfer. COCOON, August 16–29, 2005; Kunming 2005. Hallett MT, Lagergren J. RECOMB 2001. Montreal: ACM; 2001. Efficient Algorithms for Lateral Gene Transfer Problems; pp. 149–156. 
{{cite journal , vauthors = Kechris KJ, Lin JC, Bickel PJ, Glazer AN , title = Quantitative exploration of the occurrence of lateral gene transfer by using nitrogen fixation genes as a case study , journal = Proceedings of the National Academy of Sciences of the United States of America , volume = 103 , issue = 25 , pages = 9584–9 , date = June 2006 , pmid = 16769896 , pmc = 1480450 , doi = 10.1073/pnas.0603534103 , bibcode = 2006PNAS..103.9584K , doi-access = free {{cite journal , vauthors = Moran NA, Jarvik T , s2cid = 14785276 , title = Lateral transfer of genes from fungi underlies carotenoid production in aphids , journal = Science , volume = 328 , issue = 5978 , pages = 624–7 , date = April 2010 , pmid = 20431015 , doi = 10.1126/science.1187113 , url = https://semanticscholar.org/paper/3e062765bcc0196924a4973960b3a58eb2ae38f0 , bibcode = 2010Sci...328..624M {{cite journal , vauthors = Danchin EG, Rosso MN, Vieira P, de Almeida-Engler J, Coutinho PM, Henrissat B, Abad P , title = Multiple lateral gene transfers and duplications have promoted plant parasitism ability in nematodes , journal = Proceedings of the National Academy of Sciences of the United States of America , volume = 107 , issue = 41 , pages = 17651–6 , date = October 2010 , pmid = 20876108 , pmc = 2955110 , doi = 10.1073/pnas.1008486107 , bibcode = 2010PNAS..10717651D , doi-access = free {{cite journal , vauthors = Tsirigos A, Rigoutsos I , title = A new computational method for the detection of horizontal gene transfer events , journal = Nucleic Acids Research , volume = 33 , issue = 3 , pages = 922–33 , year = 2005 , pmid = 15716310 , pmc = 549390 , doi = 10.1093/nar/gki187 {{cite journal , vauthors = Baroni M, Grünewald S, Moulton V, Semple C , s2cid = 3180904 , title = Bounding the number of hybridisation events for a consistent evolutionary history , journal = Journal of Mathematical Biology , volume = 51 , issue = 2 , pages = 171–82 , date = August 2005 , pmid = 15868201 , doi = 
10.1007/s00285-005-0315-9 , hdl-access = free , hdl = 10092/12222 {{cite journal , vauthors = Boc A, Philippe H, Makarenkov V , title = Inferring and validating horizontal gene transfer events using bipartition dissimilarity , journal = Systematic Biology , volume = 59 , issue = 2 , pages = 195–211 , date = March 2010 , pmid = 20525630 , doi = 10.1093/sysbio/syp103 , publisher = Oxford University Press , doi-access = free {{cite journal , vauthors = Boc A, Diallo AB, Makarenkov V , title = T-REX: a web server for inferring, validating and visualizing phylogenetic trees and networks , journal = Nucleic Acids Research , volume = 40 , pages = W573-9 , date = July 2012 , pmid = 22675075 , pmc = 3394261 , doi = 10.1093/nar/gks485 , publisher = Oxford University Press , issue = W1 Computational biology