
Algorithms of Advanced Fuzzy Clustering: Development, Analysis, and Application to Rule-Based Modeling

  • Author / Creator
    Shen, Yinghua
  • Fuzzy clustering is one of the most significant techniques used to explore the structure of data. With the development of information technology and new requirements arising from data analysis, the emerging characteristics of data pose new challenges for fuzzy clustering algorithms: the data may be distributed, granular, big, or partially supervised by domain experts or data analysts. The effectiveness, efficiency, and even feasibility of commonly encountered fuzzy clustering algorithms, e.g., Fuzzy C-Means (FCM), can no longer be guaranteed. This situation creates an urgent need for more advanced clustering algorithms. Hence, the main objective of this dissertation is to develop and analyze a series of fuzzy clustering algorithms that address the issues mentioned above. In addition, since clustering plays a significant role in constructing the fuzzy rule-based model (FRBM), several of the proposed clustering algorithms are used either to expand the application scenarios of the FRBM or to improve its performance. The major originality of this dissertation lies in identifying the major characteristics of data encountered nowadays, proposing corresponding novel solutions for exploring data structure, and applying these solutions to system identification.
    The methods used to realize this objective are briefly introduced as follows. To cluster distributed data, the horizontal collaborative fuzzy clustering (HCFC) algorithm is refined; specifically, a granular structure is formed as the global representative of all the distributed data. To cluster homogeneous granular data, we propose a comprehensive framework that unifies the processes of information granule formation, granular data clustering, and evaluation of clustering results. To cluster heterogeneous granular data, we propose approximation methods that transform information granules into the same form, after which homogeneous granular clustering can be applied directly. To cluster big data, we propose a hyperplane division-based method to obtain subsets of the original data; different clustering strategies are then provided for different clustering requirements (e.g., when a large number of clusters is sought). To make use of knowledge about the data (provided by domain experts or data analysts) during the clustering process, we develop two methods for implementing the knowledge tidbits. Furthermore, we specifically focus on applying two refined clustering algorithms to improve the FRBM. By using the HCFC algorithm, we make it possible to build the FRBM when input and output data cannot be gathered together because of data privacy. By using the supervision hints (knowledge tidbits) derived from the output space, we conduct a supervised clustering of the input space to improve the performance of the FRBM in terms of the root-mean-square error (RMSE). Experiments on both synthetic and publicly available data are used to examine the effectiveness, efficiency, and feasibility of the proposed methods.
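
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of the textbook Fuzzy C-Means (FCM) iteration, together with the RMSE measure the abstract mentions. It is not the dissertation's code and does not implement any of the proposed algorithms (HCFC, granular, big-data, or knowledge-guided clustering). The function names `fuzzy_c_means` and `rmse`, the parameter names, the default fuzzifier `m = 2.0`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook FCM sketch (hypothetical helper, not the thesis code).

    points: list of equal-length numeric tuples; c: number of clusters;
    m: fuzzifier (> 1). Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial partition matrix; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Prototype update: mean of the data weighted by u_ik ** m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        new_u = []
        for k in range(n):
            dist = [math.dist(points[k], centers[i]) for i in range(c)]
            if min(dist) == 0.0:
                # The point coincides with one or more prototypes:
                # split full membership among those prototypes.
                hits = sum(1 for d in dist if d == 0.0)
                row = [1.0 / hits if d == 0.0 else 0.0 for d in dist]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)

        # Stop once the partition matrix no longer changes noticeably.
        change = max(abs(new_u[k][i] - u[k][i])
                     for k in range(n) for i in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data (an assumption): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, memberships = fuzzy_c_means(points, c=2)
```

Each row of `memberships` holds one point's degrees of belonging to the `c` clusters and sums to 1. This single-site, fully unsupervised setting with all data in memory is the one the abstract argues can break down for distributed, granular, big, or partially supervised data.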
    

  • Subjects / Keywords
  • Graduation date
    Fall 2019
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/r3-mcz1-6356
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.