Similarity Assessment of Data in Semantic Web

Dehleh Hossein Zadeh, Parisa

doi:10.7939/R3W08WT0G

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Dehleh Hossein Zadeh, Parisa
The web is a constantly growing repository of information. The enormous amount of information available on the web creates a demand for automatic ways of processing and analyzing data. One of the most common activities performed by these processes is the comparison of data, which is done to find something new or to confirm things we already know. In each case there is a need to determine the similarity between different objects and pieces of information. Determining similarity is relatively easy for numerical data, but much less so for symbolic data. To make the data stored on the Internet more accessible, a new model of data representation has been introduced: the Resource Description Framework (RDF). Linked data provides an open platform for representing and storing structured data as well as ontologies. This aspect of data representation has been fully utilized to provide the foundations for new forms of the Internet: Linked Data and the Semantic Web. In this thesis, we investigate the problem of determining semantic similarity between entities, in which not just the lexical and syntactic information of entities is used, but the whole existing knowledge structure, including the instantiated ontology, is exploited. The idea is based on the fact that entities are interconnected and their semantics are defined via their connections to other entities as well as by the metadata expressed as an ontology. We propose feature-based methods for similarity assessment of concepts represented in an ontology as well as in the less constrained Resource Description Framework. Membership functions are used to capture the importance of connections between entities at different hierarchy levels in the ontology. We use an importance-weighted, quantifier-guided aggregation operator to combine the similarity values related to different groups of properties.
In another proposed approach, we use concepts of possibility theory to determine the lower and upper bounds of similarity intervals. In addition, we address contextual similarity assessment, in which only a specific context is taken into consideration. The idea of ranking an entity's features according to their importance in describing that entity is introduced. We propose an approach that calculates similarity measures for these categories of features and then aggregates them using fuzzy-expressed weights that represent the rankings of these categories. The promising results of the developed similarity method have encouraged us to extend it to a more comprehensive approach. As a result, we propose a technique for automatically identifying the importance of features and ranking them accordingly. Finally, we tackle the problem of using heterogeneous feature types to define entities. A method is described that uses fuzzy set theory and linguistic aggregation to compare features of different types. We deploy this technique in a practical pharmaceutical application, where the proposed similarity assessment is shown to be capable of finding relevant entities (drugs, in this case) in spite of the heterogeneous features used to define them.
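The following is an illustrative sketch only, not the method developed in the thesis. It assumes entities are described by sets of RDF-style (property, object) pairs, uses plain Jaccard overlap as a stand-in for the per-property similarity, and combines the per-property values with an importance-weighted, quantifier-guided aggregation in the commonly cited OWA form. All function names, the default quantifier Q(r) = r^2, and the sample data are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str]  # hypothetical (property, object) pair describing an entity


def objects_by_property(triples: Set[Triple]) -> Dict[str, Set[str]]:
    """Group an entity's objects by the property that links to them."""
    grouped: Dict[str, Set[str]] = {}
    for prop, obj in triples:
        grouped.setdefault(prop, set()).add(obj)
    return grouped


def group_similarity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two object sets (an assumed stand-in measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def quantifier_guided_aggregate(
    values: List[float],
    importances: List[float],
    quantifier: Callable[[float], float],
) -> float:
    """Aggregate values with importance weights shaped by a quantifier Q.

    Values are sorted in descending order; the weight of the j-th value is
    Q(S_j / T) - Q(S_{j-1} / T), where S_j is the cumulative importance of
    the first j sorted values and T is the total importance.
    """
    total = sum(importances)
    if total == 0:
        return 0.0
    ordered = sorted(zip(values, importances), key=lambda p: p[0], reverse=True)
    result = 0.0
    cumulative = 0.0
    prev_q = quantifier(0.0)
    for value, importance in ordered:
        cumulative += importance
        q = quantifier(cumulative / total)
        result += (q - prev_q) * value
        prev_q = q
    return result


def entity_similarity(
    e1: Set[Triple],
    e2: Set[Triple],
    importance: Dict[str, float],
    quantifier: Callable[[float], float] = lambda r: r ** 2,
) -> float:
    """Similarity of two entities as an aggregate of per-property overlaps."""
    g1, g2 = objects_by_property(e1), objects_by_property(e2)
    props = sorted(set(g1) | set(g2))
    values = [group_similarity(g1.get(p, set()), g2.get(p, set())) for p in props]
    weights = [importance.get(p, 1.0) for p in props]  # default weight is assumed
    return quantifier_guided_aggregate(values, weights, quantifier)


# Made-up sample data, for illustration only.
drug_a = {("treats", "pain"), ("treats", "fever"), ("class", "nsaid")}
drug_b = {("treats", "pain"), ("treats", "fever"),
          ("treats", "inflammation"), ("class", "nsaid")}
score = entity_similarity(drug_a, drug_b, {"treats": 2.0, "class": 1.0})
```

With the identity quantifier Q(r) = r, the aggregation reduces to an importance-weighted mean. A convex quantifier such as Q(r) = r^2 shifts weight toward the lower similarity values, so a high score requires most property groups to agree.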
Subjects / Keywords
Graduation date

Spring 2016
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R3W08WT0G
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Electrical and Computer Engineering
Specialization
- Software Engineering and Intelligent Systems
Supervisor / co-supervisor and their department(s)
- Marek Z. Reformat (Electrical and Computer Engineering)
Examining committee members and their departments
- Chang-Shing Lee (National University of Tainan, Taiwan)
- Ken Wong (Computing Science)
- Petr Musilek (Electrical and Computer Engineering)
- Marek Z. Reformat (Electrical and Computer Engineering)
- Witold Pedrycz (Electrical and Computer Engineering)
- Di Niu (Electrical and Computer Engineering)