Discovering Spatial Patterns using Statistically Significant Dependencies

  • Author / Creator
    Mohomed Jabbar, Mohomed Shazan
  • Co-location pattern mining is a class of techniques for finding associations among spatial features, with applications ranging from business to science. Our work is motivated by an application in environmental health, where the goal is to investigate whether maternal exposure to air pollutants during pregnancy could be a potential cause of adverse birth outcomes. Discovering such relationships can be framed as finding spatial associations (i.e., co-location patterns) between adverse birth outcomes and air pollutant emissions. However, the increasing complexity of application problems poses new challenges that traditional approaches do not address well. For instance, comparing and contrasting spatial groups is one such complex task, posed as a research question in our application problem. Furthermore, traditional co-location pattern mining techniques rely heavily on frequency-based thresholds, which discard underrepresented rare patterns and report exaggerated, noisy patterns that may not be equally prevalent in unseen data. To address the limitations of frequency-based methods, some association studies propose using statistical significance tests. A spatial data transactionization mechanism makes it possible to exploit such statistically significant association mining methods to find strong co-location patterns more efficiently. To this end, we propose a novel approach, AGT-Fisher, which performs transactionization and uses statistically significant dependency rules to find strong co-location patterns more efficiently. Our experiments show that AGT-Fisher does help find co-location patterns with better statistical significance. Furthermore, to compare spatial groups, we introduce two new spatial patterns, spatial contrast sets and spatial common sets, along with techniques based on AGT-Fisher to mine them efficiently.
Our evaluation reveals that the contrast sets we found can successfully distinguish one group from the others. We also propose a new visualization framework, VizAR, to interactively visualize complex spatial patterns such as the ones we intend to discover. With the proposed methods and the VizAR tool, we discovered that air pollutants such as heavy metals, NO2, PM2.5, PM10 and TPM are frequently associated with adverse birth outcomes.
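
The contrast the abstract draws between frequency-based thresholds and statistical significance tests can be illustrated with a minimal sketch. It is not the thesis's AGT-Fisher implementation. It assumes that spatial data has already been transactionized into sets of feature labels, and that significance of a dependency rule X → Y is measured with a one-sided Fisher's exact test, which the method's name suggests but the abstract does not spell out. The function names `contingency` and `fisher_right_tail` and the toy transactions are hypothetical.

```python
from math import comb


def contingency(transactions, x, y):
    """Count the 2x2 table for features x and y over a list of sets.

    Returns (a, b, c, d): both present, x only, y only, neither.
    """
    a = b = c = d = 0
    for t in transactions:
        has_x, has_y = x in t, y in t
        if has_x and has_y:
            a += 1
        elif has_x:
            b += 1
        elif has_y:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact test p-value for positive dependency.

    Sums the hypergeometric probabilities of observing a co-occurrence
    count of `a` or more, with the table margins held fixed.
    """
    n = a + b + c + d
    row1 = a + b  # transactions containing x
    col1 = a + c  # transactions containing y
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / total


# Hypothetical toy transactions (assumed data, for illustration only).
transactions = [
    {"PM2.5", "adverse_outcome"},
    {"PM2.5", "adverse_outcome"},
    {"PM2.5", "adverse_outcome"},
    {"PM2.5"},
    {"adverse_outcome"},
    {"NO2"},
    {"NO2"},
    set(),
]

table = contingency(transactions, "PM2.5", "adverse_outcome")
support = table[0] / len(transactions)  # what a frequency threshold looks at
p_value = fisher_right_tail(*table)     # what a significance test looks at
print(f"support={support:.3f}, p-value={p_value:.4f}")
```

A frequency threshold keeps or discards the rule based on `support` alone, whereas the p-value accounts for how often each feature occurs on its own, so a rare but strongly dependent pattern can still be reported.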

  • Subjects / Keywords
  • Graduation date
    Fall 2016
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Nascimento, Mario (Computing Science)
    • Yasui, Yutaka (School of Public Health)
    • Zaiane, Osmar (Computing Science)