


Permanent link (DOI):




This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Error-tolerant Exemplar Queries on Knowledge Graphs
Open Access


Other title
Subject/Keyword
Knowledge Graphs
Graph Edit Distance
Subgraph Search
Exemplar Queries
Type of item
Degree grantor
University of Alberta
Author or creator
Shao, Zhaoyang
Supervisor and department
Rafiei, Davood (Computing Science)
Examining committee member and department
Rafiei, Davood (Computing Science)
Stewart, Lorna (Computing Science)
Barbosa, Denilson (Computing Science)
Department of Computing Science

Date accepted
Graduation date
2017-06:Spring 2017
Master of Science
Degree level
Edge-labeled graphs are widely used to describe relationships between entities in a database. We study a class of queries on edge-labeled graphs, referred to as exemplar queries, where each query gives an example of what the user is searching for. Given an exemplar query, we study the problem of efficiently searching for similar subgraphs in a large data graph, where similarity is defined in terms of the well-known graph edit distance. We call these queries error-tolerant exemplar queries, since matches are allowed despite small variations in the graph structure and the labels. The problem in its general case is computationally intractable, but efficient solutions are reachable for labeled graphs under a well-behaved distribution of the labels, as is commonly found in knowledge graphs. In this thesis, we propose two efficient exact algorithms, based on a filtering-and-verification framework, for finding subgraphs in a large data graph that are isomorphic to a query graph under some edit operations. Our filtering scheme, which uses the neighbourhood structure around a node and the presence or absence of paths, significantly reduces the number of candidates that are passed to the verification stage. We analyze the costs of our algorithms and the conditions under which one algorithm is expected to outperform the other. Our cost analysis identifies some of the variables that affect the cost, including the number and selectivity of the edge labels in the query and the degree of nodes in the data graph, and characterizes the relationships. We empirically evaluate the effectiveness of our filtering schemes and queries, the efficiency of our algorithms, and the reliability of our cost models on real datasets.
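To make the filtering-and-verification idea in the abstract concrete, here is a minimal Python sketch. It is not the thesis's algorithm. It assumes a directed graph stored as a set of `(source, label, target)` triples, ignores node labels, and replaces full graph edit distance with a simpler cost: the number of query edges that have no identically labeled counterpart under a given node mapping. The names `label_profile`, `candidates`, `matches`, `data`, and `query` are all hypothetical.

```python
from collections import Counter
from itertools import product


def label_profile(edges, node):
    """Multiset of (direction, label) pairs for the edges touching `node`."""
    profile = Counter()
    for src, label, dst in edges:
        if src == node and dst == node:
            profile[("loop", label)] += 1
        elif src == node:
            profile[("out", label)] += 1
        elif dst == node:
            profile[("in", label)] += 1
    return profile


def candidates(query_edges, data_edges, q_node, threshold):
    """Filtering: keep data nodes whose neighbourhood lacks at most
    `threshold` of the incident edge labels of the query node."""
    q_profile = label_profile(query_edges, q_node)
    data_nodes = {n for src, _, dst in data_edges for n in (src, dst)}
    kept = []
    for node in sorted(data_nodes):
        # Counter subtraction keeps only positive counts, i.e. missing labels.
        missing = q_profile - label_profile(data_edges, node)
        if sum(missing.values()) <= threshold:
            kept.append(node)
    return kept


def matches(query_edges, data_edges, threshold):
    """Verification: try every injective assignment of surviving candidates
    and keep those with at most `threshold` unmatched query edges."""
    q_nodes = sorted({n for src, _, dst in query_edges for n in (src, dst)})
    data_set = set(data_edges)
    pools = [candidates(query_edges, data_edges, q, threshold) for q in q_nodes]
    found = []
    for assignment in product(*pools):
        if len(set(assignment)) < len(assignment):
            continue  # two query nodes mapped to the same data node
        mapping = dict(zip(q_nodes, assignment))
        unmatched = sum(
            (mapping[src], label, mapping[dst]) not in data_set
            for src, label, dst in query_edges
        )
        if unmatched <= threshold:
            found.append((mapping, unmatched))
    return found


# Toy data graph and an exemplar-style query (both made up for illustration).
data = {
    ("alice", "worksAt", "acme"),
    ("alice", "livesIn", "paris"),
    ("bob", "worksAt", "acme"),
    ("bob", "livesIn", "rome"),
    ("carol", "livesIn", "paris"),
}
query = {("p", "worksAt", "c"), ("p", "livesIn", "t")}

exact = matches(query, data, threshold=0)
tolerant = matches(query, data, threshold=1)
```

With `threshold=0` the filter discards `carol` as a candidate for `p` because she has no outgoing `worksAt` edge; with `threshold=1` she survives the filter and matches with one missing edge. The verification step here is brute force and is only practical on tiny graphs. It is meant to show why shrinking the candidate pools beforehand matters.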
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 6361190 bytes
Last modified: 2017:06:13 12:17:39-06:00
Filename: Shao_Zhaoyang_201612_MSc.pdf
Original checksum: 297864132eec465ea0691db889362ad3
Well formed: true
Valid: true
Page count: 63