Proteomic Pattern Recognition

  • Author(s) / Creator(s)
  • Technical report TR04-10. This report surveys the problem of classifying mass spectrometry data and extracting features from it. After reviewing previous research, new classification and feature extraction techniques are presented and empirically evaluated on three data sets. A key point of this work is that feature extraction techniques comprise both dimensionality reduction and feature selection methods, although the two notions are quite different. The need for dimensionality reduction stems from the fact that classification algorithms cannot cope with a large number of input variables. Feature selection techniques, on the other hand, attempt to remove irrelevant and/or redundant features. Classification algorithms often cannot handle both a large number of variables and irrelevant variables that are unnecessary or, even worse, misleading. To keep the evaluation of the dimensionality reduction and feature selection techniques tractable, a simple classifier is used to measure performance. The experiments indicate that feature selection algorithms tend to both reduce data dimensionality and increase classification accuracy, while the studied dimensionality reduction technique sacrifices performance in exchange for lowering the number of features a learning algorithm needs to deal with.
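
The distinction between feature selection and dimensionality reduction, and the idea of scoring both with one simple classifier, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the report: the synthetic data, the t-like ranking score, the use of adjacent-feature binning as a stand-in dimensionality reduction, the nearest-centroid classifier, and all function names.

```python
import random
import statistics


def make_spectra(n_per_class=40, n_features=200, informative=(10, 50, 120),
                 shift=2.0, seed=0):
    """Synthetic stand-in for spectra: unit Gaussian noise everywhere,
    plus a constant shift on a few features for class 1 (hypothetical data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def select_features(X, y, k):
    """Feature selection: return indices of the k features with the largest
    absolute class-mean difference relative to the within-class spread."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
        scores.append((abs(statistics.fmean(a) - statistics.fmean(b)) / spread, j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def bin_features(X, width):
    """Dimensionality reduction: average each run of `width` adjacent features,
    without checking whether any of them is relevant."""
    return [[statistics.fmean(row[i:i + width]) for i in range(0, len(row), width)]
            for row in X]


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Simple classifier: assign each test row to the closest class centroid."""
    centroids = {}
    for lab in set(y_train):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [statistics.fmean(col) for col in zip(*rows)]
    correct = 0
    for row, lab in zip(X_test, y_test):
        pred = min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
        correct += pred == lab
    return correct / len(y_test)


def evaluate(k=5, width=20, seed=0):
    """Compare raw, selected, and binned features with the same classifier."""
    X, y = make_spectra(seed=seed)
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    half = len(order) // 2
    train, test = order[:half], order[half:]
    X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
    X_te, y_te = [X[i] for i in test], [y[i] for i in test]

    # Selection is fitted on the training half only, then applied to both halves.
    keep = select_features(X_tr, y_tr, k)
    pick = lambda rows: [[r[j] for j in keep] for r in rows]

    return {
        "raw": nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te),
        "selected": nearest_centroid_accuracy(pick(X_tr), y_tr, pick(X_te), y_te),
        "binned": nearest_centroid_accuracy(bin_features(X_tr, width), y_tr,
                                            bin_features(X_te, width), y_te),
    }
```

Calling `evaluate()` returns one accuracy per representation. Because the same classifier is used throughout, differences between the three numbers reflect the feature representation rather than the learner, though results on this toy data say nothing about the report's own data sets.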

  • Date created
  • Subjects / Keywords
  • Type of Item
  • DOI
  • License
    Attribution 3.0 International