Classification and Sequential Pattern Mining From Uncertain Datasets

  • Author / Creator
    Hooshsadat, Metanat
  • Several research projects explore applications of uncertain databases, which contain probabilistic attributes. Uncertainty in data can arise from inherent randomness, imprecise measuring equipment, ambiguity, information extraction from unstructured data, and other sources. Classification and Sequential Pattern Mining (SPM) of uncertain datasets both play a vital role in decision-making systems and have recently attracted significant attention. In this study, we propose two novel algorithms, one for each of these problems. Our associative classifier for uncertain data, UAC, has an effective rule pruning strategy. Using a general model for uncertainty, our experiments show that in many cases UAC reaches higher accuracy than existing algorithms. For SPM on uncertain data, other studies have aimed to solve the problem for specific uncertainty models. We introduce UAprioriAll, which conducts SPM on datasets with general attribute-level uncertainty. Our experiments show that this method scales linearly with the number of transactions.

  • Degree
    Master of Science
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Institution
    University of Alberta
  • Department
    • Department of Computing Science
  • Supervisor / co-supervisor and their department(s)
    • Zaiane, Osmar (Computing Science)
  • Examining committee members and their departments
    • Kurgan, Lukasz (Electrical and Computer Engineering)
    • Kondrak, Greg (Computing Science)