






This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Search Term Selection and Document Clustering for Query Suggestion Open Access


Keywords
query suggestion, document clustering, query search, search term selection
Type of item
Degree grantor
University of Alberta
Author or creator
Zhang, Xiaomin
Supervisor and department
Zilles, Sandra (Computer Science, University of Regina)
Holte, Robert (Computing Science)
Examining committee member and department
Zilles, Sandra (Computer Science, University of Regina)
Holte, Robert (Computing Science)
Zhao, Dangzhi (Library and Information Studies)
Goebel, Randy (Computing Science)
Department of Computing Science

Date accepted
Graduation date
Master of Science
Degree level
To improve a user's query and help the user quickly satisfy his or her information need, most search engines provide query suggestions that are meant to be relevant alternatives to the user's query. This thesis builds on the query suggestion system and evaluation methodology described in Shen Jiang's Master's thesis (2008). Jiang's system constructs query suggestions by searching for lexical aliases of web documents and then applying query search to the lexical aliases. A lexical alias for a web document is a list of terms that, when used as a query, returns the web document in a top-ranked position. Query search is a search process that finds useful combinations of search terms. The main focus of this thesis is to supply alternatives for the components of Jiang's system. We suggest three term scoring mechanisms and generalize Jiang's lexical alias search into a general search for terms that are useful for constructing good query suggestions. We also replace Jiang's top-down query search with a bottom-up beam search method. We show experimentally that our query suggestion method improves on Jiang's system by 30% for short queries and 90% for long queries under Jiang's evaluation method. We also add new evidence supporting Jiang's conclusion that terms from the user's initial query are important to include in the query suggestions. Finally, we explore the usefulness of document clustering in creating query suggestions. Our experimental results are the opposite of what we expected: query suggestion based on clustering does not perform nearly as well, in terms of the "coverage" scores we use for evaluation, as our best method that is not based on document clustering.
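The abstract mentions a bottom-up beam search over combinations of search terms but gives no details here. The following is a minimal sketch of that general technique only, not the thesis's implementation. The toy corpus `DOCS`, the target set `TARGET`, the scoring function `toy_score`, and the parameters `beam_width` and `max_len` are all hypothetical stand-ins chosen for illustration; in particular, `toy_score` is not the "coverage" measure used in the thesis.

```python
from typing import Callable, FrozenSet, List, Sequence

Query = FrozenSet[str]

# Hypothetical toy corpus: document id -> set of terms it contains.
DOCS = {
    "d1": {"jaguar", "car", "speed"},
    "d2": {"jaguar", "cat", "habitat"},
    "d3": {"jaguar", "car", "dealer"},
    "d4": {"car", "dealer"},
}
# Hypothetical set of documents the suggestion should retrieve.
TARGET = {"d1", "d3"}


def toy_score(query: Query) -> int:
    """Assumed stand-in score: target documents matched minus others matched."""
    matched = {doc for doc, words in DOCS.items() if query <= words}
    return len(matched & TARGET) - len(matched - TARGET)


def beam_search_queries(
    terms: Sequence[str],
    score: Callable[[Query], int],
    beam_width: int = 2,
    max_len: int = 2,
) -> List[Query]:
    """Bottom-up beam search: start from single terms, add one term per level."""

    def rank(queries):
        # Highest score first; ties broken alphabetically for determinism.
        return sorted(queries, key=lambda q: (-score(q), sorted(q)))

    beam = rank(frozenset([t]) for t in terms)[:beam_width]
    seen = list(beam)
    for _ in range(max_len - 1):
        # Extend every query in the beam by one term it does not yet contain.
        candidates = {q | {t} for q in beam for t in terms if t not in q}
        if not candidates:
            break
        beam = rank(candidates)[:beam_width]
        seen.extend(beam)
    return rank(set(seen))


TERMS = ["jaguar", "car", "cat", "dealer", "speed"]
result = beam_search_queries(TERMS, toy_score, beam_width=2, max_len=2)
print([sorted(q) for q in result])
```

Only the `beam_width` best queries survive each level, so the search examines far fewer term combinations than an exhaustive enumeration, at the cost of possibly missing a good combination whose sub-queries scored poorly.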
License granted by Xiaomin Zhang on 2010-12-23T20:37:36Z (GMT): Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of the above terms. The author reserves all other publication and other rights in association with the copyright in the thesis, and except as herein provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.

File Details

File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 1010067
Last modified: 2015:10:12 13:03:25-06:00
Filename: xiaomin_zhang_msc_thesis_after_defense.pdf
Original checksum: 4b16d60722f4ecd1c3cbd90c07f0ff70
Well formed: true
Valid: true
File title: xiaomin_zhang_msc_thesis.dvi
Page count: 86