Discovering Spatial Patterns using Statistically Significant Dependencies Open Access
- Other title
Spatial Data Mining
- Type of item
- Degree grantor
University of Alberta
- Author or creator
Mohomed Jabbar, Mohomed Shazan
- Supervisor and department
Zaiane, Osmar (Computing Science)
- Examining committee member and department
Nascimento, Mario (Computing Science)
Zaiane, Osmar (Computing Science)
Yasui, Yutaka (School of Public Health)
Department of Computing Science
- Date accepted
- Graduation date
Master of Science
- Degree level
Co-location pattern mining is a class of techniques for finding associations among spatial features. It has a wide range of applications, from business to science. Our work is motivated by an application in environmental health where the goal is to investigate whether maternal exposure to air pollutants during pregnancy could be a potential cause of adverse birth outcomes. Discovering such relationships can be defined as finding spatial associations (i.e. co-location patterns) between adverse birth outcomes and air pollutant emissions. However, the increasing complexity of the application problems poses new challenges that traditional approaches are unable to address well. For instance, comparing and contrasting spatial groups is one such complex task, posed as a research question in our application problem. Furthermore, traditional co-location pattern mining techniques rely heavily on frequency-based thresholds, which discard underrepresented rare patterns and find exaggerated noisy patterns that may not be equally prevalent in unseen data. To address the limitations of frequency-based methods, some association studies propose using statistical significance tests. A spatial data transactionization mechanism makes it possible to exploit such statistically significant association mining methods to find strong co-location patterns more efficiently. To this end, we propose a novel approach, AGT-Fisher, which performs transactionization and uses statistically significant dependency rules to find strong co-location patterns more efficiently. Our experiments reveal that the proposed AGT-Fisher can indeed help find co-location patterns with better statistical significance. Furthermore, to compare spatial groups we introduce two new spatial patterns, spatial contrast sets and spatial common sets, and techniques based on AGT-Fisher to mine them efficiently.
Our evaluation reveals that the contrast sets we found can successfully distinguish one group from the others. We also propose a new visualization framework, VizAR, to interactively visualize complex spatial patterns such as the ones we intend to discover. With the proposed methods and the VizAR tool, we discovered that air pollutants such as heavy metals, NO2, PM2.5, PM10 and TPM are frequently associated with adverse birth outcomes.
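The contrast between frequency thresholds and statistical significance tests described in the abstract can be illustrated with a small sketch. This is not the AGT-Fisher algorithm itself, whose details are not given on this page; it only shows, under stated assumptions, how a one-sided Fisher exact test could score a dependency rule over transactions once spatial data has been transactionized. The function names (`fisher_right_tail`, `rule_p_value`), the feature label `preterm`, and the toy transactions are all hypothetical and are not taken from the thesis.

```python
from math import comb


def fisher_right_tail(n_xa, n_x, n_a, n):
    """Probability of at least n_xa co-occurrences under independence.

    Hypergeometric tail: n transactions in total, n_x contain the
    antecedent, n_a contain the consequent, n_xa contain both.
    """
    total = comb(n, n_a)
    upper = min(n_x, n_a)
    hits = sum(comb(n_x, k) * comb(n - n_x, n_a - k)
               for k in range(n_xa, upper + 1))
    return hits / total


def rule_p_value(transactions, antecedent, consequent):
    """One-sided Fisher p-value for the rule antecedent -> consequent."""
    n = len(transactions)
    n_x = sum(1 for t in transactions if antecedent <= t)
    n_a = sum(1 for t in transactions if consequent in t)
    n_xa = sum(1 for t in transactions
               if antecedent <= t and consequent in t)
    return fisher_right_tail(n_xa, n_x, n_a, n)


# Made-up transactions (sets of feature labels seen near one location).
# These are illustrative only and are NOT data from the thesis.
transactions = [
    {"PM2.5", "preterm"},
    {"PM2.5", "preterm"},
    {"PM2.5", "preterm"},
    {"PM2.5"},
    {"NO2"},
    {"NO2"},
    {"preterm"},
    {"NO2"},
]

p = rule_p_value(transactions, {"PM2.5"}, "preterm")
print(f"p = {p:.4f}")
```

On this toy input the rule co-occurs in 3 of 8 transactions, and the p-value is 17/70 (about 0.243). A frequency threshold would judge the rule by the 3/8 support alone, whereas the test asks how surprising that count is given how often each feature occurs on its own.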
- This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
- Citation for previous publication
Jundong Li, Aibek Adilmagambetov, Mohomed Shazan Mohomed Jabbar, Osmar R. Zaiane, Alvaro Osornio-Vargas, and Osnat Wine. On discovering co-location patterns in datasets: a case study of pollutants and child cancers. GeoInformatica, 20(4):651–692, 2016.
Mohomed Shazan Mohomed Jabbar and Osmar R. Zaïane. Learning statistically significant contrast sets. In Proceedings of the 29th Canadian Conference on Artificial Intelligence (Canadian AI 2016), pages 237–242. Springer, 2016.
- Date Uploaded
- Date Modified
File format: pdf (PDF/A)
Mime type: application/pdf
File size: 2561484
Last modified: 2016:11:16 15:33:31-07:00
Original checksum: e1e3eb00d0ecd8493051ca1440e9189c