ERA

Download the full-sized PDF of FastLSA - A Fast Linear-Space Algorithm for Sequence AlignmentDownload the full-sized PDF

Analytics

Share

Permanent link (DOI): https://doi.org/10.7939/R3RF5KH38

Download

Export to: EndNote  |  Zotero  |  Mendeley

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

FastLSA - A Fast Linear-Space Algorithm for Sequence Alignment Open Access

Descriptions

Author or creator
Charter, K.
Driga, A.
Lu, Paul
Schaeffer, Jonathan
Szafron, Duane
Parsons, I.
Additional contributors
Subject/Keyword
Sequence Alignment
Bioinformatics
Type of item
Computing Science Technical Report
Computing science technical report ID
TR01-10
Language
English
Place
Time
Description
Technical report TR01-10. For two DNA or protein sequences of length m and n, dynamic programming alignment algorithms like Needleman-Wunsch and Smith-Waterman take O(m x n) time and use O(m x n) space, so we refer to them as full matrix (FM) algorithms. This space requirement means that large sequences will not completely fit in main memory. It also means that shorter sequences will not completely reside in cache memory. Hirschberg's algorithm reduces the space requirements to O(min(m, n)), but requires approximately twice the number of operations required by the FM algorithms. This paper presents the FastLSA algorithm that is adaptive to the amount of space available. It allows a user to trade space for operations. At one extreme, it uses linear space with approximately 1.5 times the number of operations required by the FM algorithms. At the other extreme, it uses quadratic space with no extra operations compared to the FM algorithms. However, our experiments show that in practice, due to memory caching effects, FastLSA and Hirschberg's algorithm are often faster than FM. This is true even though the number of operations is higher and even when there is enough main memory to hold the entire dynamic programming matrix of the FM algorithm. FastLSA has been incorporated into the newest and fastest commercialproducts like BioTools' ChromaTool. This paper also briefly describes how to parallelize the FastLSA algorithm to further improve its performance.
Date created
2001
DOI
doi:10.7939/R3RF5KH38
License information
Creative Commons Attribution 3.0 Unported
Rights

Citation for previous publication

Source
Link to related item

File Details

Date Uploaded
Date Modified
2014-04-24T23:39:39.471+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 392021
Last modified: 2015:10:12 16:48:56-06:00
Filename: TR01-10.pdf
Original checksum: 3c6c0d840f7056aa228741dbfe29fca1
Well formed: true
Valid: true
File title: Microsoft Word - BioInformatics2001-16.doc
File author: Duane Szafron
Page count: 26
Activity of users you follow
User Activity Date