






This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Statistically Significant Dependencies for Spatial Co-location Pattern Mining and Classification Association Rule Discovery Open Access


Other title
Statistically Significant Dependencies
Spatial Co-location Pattern Mining
Associative Classification
Type of item
Degree grantor
University of Alberta
Author or creator
Li, Jundong
Supervisor and department
Zaiane, Osmar (Computing Science)
Examining committee member and department
Zaiane, Osmar (Computing Science)
Sander, Joerg (Computing Science)
Musilek, Petr (Electrical and Computer Engineering)
Department of Computing Science

Date accepted
Graduation date
Master of Science
Degree level
Spatial co-location pattern mining and classification association rule discovery are two canonical tasks studied in the data mining community. Both focus on detecting sets of features that show associations. The difference is that in spatial co-location pattern mining all features are spatial features that carry location information, whereas in classification association rule discovery the mining process is constrained to generate association rules whose consequent is always a class label. Existing methods for these two tasks mostly use the support-confidence framework, in an Apriori-like way or through an FP-growth approach, to mine co-location patterns and classification association rules, an approach that requires setting confounding parameters. However, because this framework does not account for statistical dependencies between features, it may omit many interesting patterns and/or detect meaningless rules. To address these limitations, we fully exploit the property of statistical significance and propose a novel algorithm for each of the two tasks. The first, CMCStatApriori, is a co-location mining algorithm that is able to detect more general and statistically significant co-location rules. We apply it to real datasets with the National Pollutant Release Inventory (NPRI) and propose a classification scheme to help evaluate the discovered co-location rules. The second algorithm, SigDirect, is an associative classifier that mines classification association rules showing statistically significant dependencies between a set of antecedent features and a class label. Experimental results on UCI datasets show that SigDirect achieves competitive, if not better, classification performance while producing a very small number of rules. We also show the potential of integrating statistically significant negative classification association rules into the SigDirect algorithm.
Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.
Citation for previous publication

File Details

Date Uploaded
Date Modified
File format: pdf (PDF/A)
Mime type: application/pdf
File size: 2618687
Last modified: 2015:10:22 06:25:38-06:00
Filename: Li_Jundong_201407_MSc.pdf
Original checksum: 4558cf1e9a17607a275cb4c862895cc8
Well formed: true
Valid: true
Page count: 81