This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Similarity Assessment of Data in Semantic Web
Open Access


Subject / Keyword
fuzzy set theory
linked data
resource description framework
entity matching
semantic web
information retrieval
Type of item
Degree grantor
University of Alberta
Author or creator
Dehleh Hossein Zadeh, Parisa
Supervisor and department
Marek Z. Reformat (Electrical and Computer Engineering)
Examining committee member and department
Chang-Shing Lee (National University of Tainan, Taiwan)
Witold Pedrycz (Electrical and Computer Engineering)
Di Niu (Electrical and Computer Engineering)
Marek Z. Reformat (Electrical and Computer Engineering)
Petr Musilek (Electrical and Computer Engineering)
Ken Wong (Computing Science)
Department
Department of Electrical and Computer Engineering
Specialization
Software Engineering and Intelligent Systems
Date accepted
Graduation date
Degree
Doctor of Philosophy
Degree level
The web is a constantly growing repository of information. The enormous amount of information available on the web creates a demand for automatic ways of processing and analyzing data. One of the most common activities performed by these processes is the comparison of data, done either to find something new or to confirm what we already know. In each case, there is a need to determine the similarity between different objects and pieces of information. Determining similarity is relatively easy for numerical data, but not for symbolic data. To make the data stored on the Internet more accessible, a new model of data representation has been introduced: the Resource Description Framework (RDF). Linked data provides an open platform for representing and storing structured data as well as ontologies. This aspect of data representation has been fully utilized to provide the foundations for new forms of the Internet: Linked Data and the Semantic Web. In this thesis, we investigate the problem of determining semantic similarity between entities, in which not only the lexical and syntactic information of entities is used, but the whole existing knowledge structure, including the instantiated ontology, is exploited. The idea is based on the fact that entities are interconnected and that their semantics are defined via their connections to other entities as well as by metadata expressed as an ontology. We propose feature-based methods for similarity assessment of concepts represented in an ontology as well as in the less constrained Resource Description Framework. Membership functions are used to capture the importance of connections between entities at different hierarchy levels in the ontology. We leverage an importance-weighted, quantifier-guided operator to aggregate the similarity values related to different groups of properties.
In another proposed approach, we use concepts from possibility theory to determine the lower and upper bounds of similarity intervals. In addition, we address contextual similarity assessment, in which only a specific context is taken into consideration. The idea of ranking an entity's features according to their importance in describing the entity is introduced. We propose an approach that calculates similarity measures for these categories of features and then aggregates them using fuzzy-expressed weights that represent the rankings of these categories. The promising results of our similarity method encouraged us to extend it to a more comprehensive approach. As a result, we propose a technique for automatically identifying the importance of features and ranking them accordingly. Finally, we tackle the problem of using heterogeneous feature types to define entities. We describe a method that uses fuzzy set theory and linguistic aggregation to compare features of different types. We deploy this technique in a practical pharmaceutical application, where the proposed similarity assessment is shown to be capable of finding relevant entities (drugs, in this case) despite the heterogeneous features used to define them.
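As a rough illustration of the importance-weighted, quantifier-guided aggregation mentioned in the abstract, the sketch below follows the general form of Yager's quantifier-guided OWA operator with importances. The abstract does not give the exact formulation used in the thesis, so the function name `quantifier_guided_owa`, the default quantifier Q(r) = r^2, and the sample scores are all assumptions made for illustration only.

```python
from typing import Callable, Sequence


def quantifier_guided_owa(
    scores: Sequence[float],
    importances: Sequence[float],
    quantifier: Callable[[float], float] = lambda r: r ** 2,
) -> float:
    """Aggregate per-group similarity scores (hypothetical helper).

    Assumes a regular increasing monotone quantifier Q with Q(0) = 0 and
    Q(1) = 1. Scores are sorted in descending order, and the weight of the
    j-th score is Q(S_j / T) - Q(S_{j-1} / T), where S_j is the cumulative
    importance of the first j sorted scores and T is the total importance.
    """
    if len(scores) != len(importances):
        raise ValueError("scores and importances must have the same length")
    total = sum(importances)
    if total <= 0:
        raise ValueError("total importance must be positive")

    # Pair each score with its importance and order by score, largest first.
    ordered = sorted(zip(scores, importances), key=lambda p: p[0], reverse=True)

    result = 0.0
    cumulative = 0.0
    previous_q = quantifier(0.0)
    for score, importance in ordered:
        cumulative += importance
        current_q = quantifier(cumulative / total)
        result += (current_q - previous_q) * score  # OWA weight times score
        previous_q = current_q
    return result


# Illustrative values only: similarities for three property groups and
# their assumed importances.
group_similarities = [0.9, 0.4, 0.6]
group_importances = [1.0, 2.0, 1.0]
aggregated = quantifier_guided_owa(group_similarities, group_importances)
```

With the identity quantifier Q(r) = r, this reduces to an importance-weighted average; a convex quantifier such as Q(r) = r^2 shifts weight toward the lower scores, so the result behaves more like a "most groups must agree" requirement.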
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
P. D. Hossein Zadeh and M. Z. Reformat (2015) The Web, Similarity, and Fuzziness. 50 Years of Fuzzy Logic and its Applications, Springer, volume 326, pp. 519-536. (Published)
R. R. Yager et al. (2013) Learning Techniques in the Presence of Uncertainty. Soft Computing: State of the Art Theory and Novel Applications, Springer, volume 291, pp. 129-143. (Published)
M. Z. Reformat and P. D. Hossein Zadeh (2012) Assimilation of Information in RDF-Based Knowledge Base. Advances in Computational Intelligence, Communications in Computer and Information Science, Springer, volume 299, pp. 191-200. (Published)
P. D. Hossein Zadeh, M. D. Hossein Zadeh, and M. Reformat (2015) Feature-driven Linguistic-based Entity Matching in Linked Data with Application in Pharmacy. Soft Computing Journal, Springer, pp. 1-16. (Published)
P. D. Hossein Zadeh and M. Z. Reformat (2013) Context-aware similarity assessment within semantic space formed in linked data. Journal of Ambient Intelligence and Humanized Computing, volume 4, issue 4, pp. 515-532. (Published)
P. D. Hossein Zadeh and M. Z. Reformat (2013) Assessment of Semantic Similarity of Concepts Defined in Ontology. Journal of Information Sciences, Elsevier, volume 250, pp. 21-39. (Published)
P. D. Hossein Zadeh and M. Z. Reformat (2012) Fuzzy Semantic Similarity in Linked Data using the OWA Operator. 2012 Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS), San Francisco, CA, pp. 1-6.
P. D. Hossein Zadeh and M. Z. Reformat (2012) Feature-based Similarity Assessment in Ontology using Fuzzy Set Theory. IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), Brisbane, Australia, June 10-15, pp. 1-7.
P. D. Hossein Zadeh and M. Z. Reformat (2013) Fuzzy Semantic Similarity in Linked Data using Wikipedia Infobox. Joint IFSA World Congress and NAFIPS Annual Meeting (IFSA/NAFIPS), Edmonton, Canada, pp. 395-400.
P. D. Hossein Zadeh and M. Z. Reformat (2012) Assimilation of Information in Linked Data Knowledge Base. 14th International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems (IPMU'12), Catania, Italy, July 9-13.
P. D. Hossein Zadeh and M. Z. Reformat (2011) Learning Mechanisms and Uncertainty: Necessity or Trend. World Conference on Soft Computing, San Francisco, USA, May 23-26.

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 3978100
Last modified: 2016:06:16 17:11:41-06:00
Filename: Dehlehhosseinzadeh_Parisa_201601_PhD.pdf
Original checksum: ca20e79fe90e3d51c42efcc0433ad7bf