Toward Interpretable Personas for Banking Customers

  • Author / Creator
    Md Monir Hossain
  • The financial landscape is in a state of major disruption through digitalization. The steady adoption of Artificial Intelligence (AI) has created many opportunities for banks and financial institutions to drive efficiency and innovation. These institutions are inherently customer-centric, so understanding their customer base is one of their main interests. Customer segmentation supports this by dividing customers into groups according to a chosen criterion. In most cases, segmentation relies on traditional, naive approaches such as demographic features or specifically calculated financial values. The pitfalls of these approaches are that they disregard the rich customer data these institutions collect, introduce bias, and fail to capture micro-segments. This thesis presents a novel big data analytics framework to create interpretable personas for retail and business banking customers. These data-driven personas are essential for better tailoring financial products and improving customer retention.

    In this thesis, we start with a comprehensive overview of big data analytics frameworks in finance, time series anomaly detection, customer segmentation, time series clustering, association rule mining, and distributed frameworks in big data analytics and Machine Learning (ML). We then present the methodology, including a description of the retail and business customer datasets used in our experiments. The proposed framework comprises several components: pre-processing, anomaly detection, clustering of transaction time series, and mining of association rules that map contextual data to cluster identifiers. We use anomaly detection to improve later-stage clustering and to find interesting properties of financial time series. We compare different raw-data-based clustering techniques and select the best methods based on internal evaluation metrics and cluster stability. We then apply association rule analysis, combining the contextual data with the obtained clusters. Leveraging rich transaction and contextual data from nearly 60,000 retail and 90,000 business customers of a financial institution, we empirically evaluate this framework and describe how the identified association rules can be used to explain and refine existing customer classes, and to identify new customer classes and various data quality issues. We also analyze the performance of the proposed framework and explain its dynamic nature. We show that it can easily scale, both vertically and horizontally, to millions of banking customers.
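
    The last component, rules that map contextual data to cluster identifiers, can be illustrated with a minimal sketch. It is not the thesis implementation: the function name mine_cluster_rules, the attribute names, the thresholds, and the toy records are all hypothetical, and a simple exhaustive count stands in for whatever rule-mining algorithm the thesis uses. Each rule has the form "contextual attributes -> cluster" and is kept only if its support and confidence reach the chosen thresholds.

```python
from collections import Counter
from itertools import combinations


def mine_cluster_rules(records, min_support=0.3, min_confidence=0.7, max_len=2):
    """Find rules of the form {attribute=value, ...} -> cluster id.

    records: list of (attributes, cluster_id) pairs, where attributes is a
    dict of contextual fields. All names here are illustrative assumptions.
    """
    n = len(records)
    antecedent_counts = Counter()  # how often each attribute set occurs
    joint_counts = Counter()       # how often it occurs together with a cluster

    for attributes, cluster_id in records:
        items = sorted(attributes.items())
        for size in range(1, max_len + 1):
            for antecedent in combinations(items, size):
                antecedent_counts[antecedent] += 1
                joint_counts[(antecedent, cluster_id)] += 1

    rules = []
    for (antecedent, cluster_id), joint in joint_counts.items():
        support = joint / n                                # P(antecedent and cluster)
        confidence = joint / antecedent_counts[antecedent]  # P(cluster | antecedent)
        if support >= min_support and confidence >= min_confidence:
            rules.append({
                "antecedent": antecedent,
                "cluster": cluster_id,
                "support": support,
                "confidence": confidence,
            })

    # Strongest rules first.
    rules.sort(key=lambda r: (-r["confidence"], -r["support"]))
    return rules


# Toy data: contextual attributes paired with a cluster id that an earlier
# time series clustering step is assumed to have produced.
records = [
    ({"segment": "student", "region": "urban"}, 0),
    ({"segment": "student", "region": "rural"}, 0),
    ({"segment": "student", "region": "urban"}, 0),
    ({"segment": "retiree", "region": "urban"}, 1),
    ({"segment": "retiree", "region": "rural"}, 1),
    ({"segment": "student", "region": "urban"}, 1),
]

rules = mine_cluster_rules(records)
for rule in rules:
    print(rule)
```

    On this toy input, two rules pass the thresholds: segment=retiree -> cluster 1 and segment=student -> cluster 0. Rules like these are what make a cluster readable as a persona, and a rule that contradicts an existing customer class is a candidate for either a new class or a data quality check.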

  • Graduation date
    Spring 2021
  • Degree
    Master of Science
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.