Computing Phylogenetic Roots with Bounded Degrees and Errors

  • Author(s) / Creator(s)
  • Technical report TR03-06. Given a set of species and their similarity data, an important problem in evolutionary biology is how to reconstruct a phylogeny (also called evolutionary tree) so that species are close in the phylogeny if and only if they have high similarity. Assume that the similarity data are represented as a graph G = (V, E) where each vertex represents a species and two vertices are adjacent if they represent species of high similarity. The phylogeny reconstruction problem can then be abstracted as the problem of finding a (phylogenetic) tree T from the given graph G such that (1) T has no degree-2 internal nodes, (2) the external nodes (i.e. leaves) of T are exactly the elements of V, and (3) (u, v) ε E if and only if d_T(u, v) ≤ k for some fixed threshold k, where d_T(u,v) denotes the distance between u and v in tree T. This is called the Phylogenetic kth Root Problem (PRk), and such a tree T, if exists, is called a phylogenetic kth root of graph G. The computational complexity of PRk is open, except for k ≤ 4. In this paper, we investigate PRk under a natural restriction that the maximum degree of the phylogenetic root is bounded from above by a constant. Our main contribution is a linear-time algorithm that determines if G has such a phylogenetic kth root, and if so, demonstrates one. On the other hand, as in practice the collected similarity data are usually not perfect and may contain errors, we propose to study a generalized version of PRk where the output phylogeny is only required to be an approximate root of the input graph. We show that this and other related problems are computationally intractable. | TRID-ID TR03-06

  • Date created
  • Subjects / Keywords
  • Type of Item
  • DOI
  • License
    Attribution 3.0 International