Permanent link (DOI): https://doi.org/10.7939/R3FP2J

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Whole Genome Phylogeny via Complete Composition Vectors Open Access

Descriptions

Author or creator
Wu, Xiaomeng
Wan, Xiu-Feng
Wu, Gang
Xu, Dong
Lin, Guohui
Subject/Keyword
evolutionary relationships
genome phylogeny
complete composition vector
Type of item
Computing Science Technical Report
Computing science technical report ID
TR05-06
Language
English
Description
Technical report TR05-06. The availability of complete genomic sequences allows us to infer the evolutionary footprints between species with a global strategy. However, the length of these genomic sequences poses a challenge to computational efficiency and to the optimality of information representation in phylogenetic analyses. In this paper, a new method called complete composition vector (CCV) is described to infer evolutionary relationships between species using their complete genomic sequences. In this method, the character string frequencies in the complete genomic sequence of each species are represented by a complete composition vector in a high-dimensional space. After the random mutation background is filtered out, the cosines of the angles between the representing vectors are converted into pairwise evolutionary distances, from which the phylogenetic tree is constructed using the neighbor-joining algorithm. The method bypasses the complexity of performing multiple sequence alignments and avoids the ambiguity of choosing individual genes, while it is expected to effectively retain the rich evolutionary information contained in the whole genomic sequence. To verify its strengths, the method was applied to infer the evolutionary footprints of coronaviruses and microbes. On a typical desktop PC, it took only one and a half days to construct the phylogeny for 109 species, comprising 103 microbes and 6 eukaryotes. The phylogenetic trees generated by our method are highly consistent with those annotated by biologists.
Date created
2005
DOI
doi:10.7939/R3FP2J
License information
Creative Commons Attribution 3.0 Unported
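
The pipeline outlined in the description (string frequencies, background filtering, cosine-based distances) can be illustrated with a minimal Python sketch. It is not the report's implementation, and several details are assumptions the abstract does not state: a single string length `k` is used, the random background is estimated from the frequencies of the (k-1)- and (k-2)-length substrings, and the cosine is mapped to a distance as `(1 - cos) / 2`. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def kmer_freqs(seq, k):
    """Relative frequency of each length-k substring that occurs in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {s: c / total for s, c in counts.items()}


def composition_vector(seq, k):
    """Background-filtered k-string frequencies as a sparse vector (dict).

    Assumption: the expected frequency of a string s under the random
    background is f(s[:-1]) * f(s[1:]) / f(s[1:-1]), and the stored
    component is the relative deviation (observed - expected) / expected.
    """
    if k < 3:
        raise ValueError("this sketch assumes k >= 3")
    f_k = kmer_freqs(seq, k)
    f_k1 = kmer_freqs(seq, k - 1)
    f_k2 = kmer_freqs(seq, k - 2)
    alphabet = sorted(set(seq))
    vec = {}
    # Candidate strings are one-letter extensions of observed (k-1)-strings,
    # so strings that never occur but are expected still get a component.
    for prefix, prefix_f in f_k1.items():
        for ch in alphabet:
            s = prefix + ch
            suffix_f = f_k1.get(s[1:], 0.0)
            mid_f = f_k2.get(s[1:-1], 0.0)
            if suffix_f == 0.0 or mid_f == 0.0:
                continue  # expected frequency is zero; component left at 0
            expected = prefix_f * suffix_f / mid_f
            vec[s] = (f_k.get(s, 0.0) - expected) / expected
    return vec


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(x * v[s] for s, x in u.items() if s in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def distance_matrix(seqs, k=3):
    """Pairwise distances keyed by (name_a, name_b), using (1 - cos) / 2."""
    vecs = {name: composition_vector(seq, k) for name, seq in seqs.items()}
    dist = {(name, name): 0.0 for name in vecs}
    for a, b in combinations(vecs, 2):
        d = (1.0 - cosine(vecs[a], vecs[b])) / 2.0
        dist[(a, b)] = dist[(b, a)] = d
    return dist
```

Calling `distance_matrix({"a": "ACGGTACC...", "b": "TTGACCGA..."}, k=3)` returns a symmetric dictionary of distances between 0 and 1. In the described method such a matrix would then be passed to a neighbor-joining routine, which is omitted here.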

File Details

Date Uploaded
Date Modified
2014-04-25T00:46:00.276+00:00
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 280428 bytes
Last modified: 2015-10-12T13:41:40-06:00
Filename: TR05-06.pdf
Original checksum: 72e6a070da70d3e4501f01b65621d155
Well formed: true
Valid: true
File title: Introduction
Page count: 15