ERA

Permanent link (DOI): https://doi.org/10.7939/R31N7XT9N

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Towards Understanding Latent Semantic Indexing (Open Access)

Descriptions

Author or creator
Cheng, Bin
Subject/Keyword
Latent Semantic Indexing
lexical matching
Type of item
Computing Science Technical Report
Computing science technical report ID
TR03-03
Language
English
Description
Technical report TR03-03. The growing volume of available information has made information retrieval tools increasingly important. Traditionally, these tools retrieve information by literally matching terms in the documents against the terms in the query. Because of synonymy and polysemy, however, the results of such lexical matching are sometimes incomplete and inaccurate. Conceptual-indexing techniques such as Latent Semantic Indexing (LSI) have been used to overcome these problems. The LSI model uses a statistical technique, singular value decomposition (SVD), to reveal the "latent" semantic structure and eliminate much of the "noise" (variability of word choice), which lets it deal with the problems caused by synonymy and polysemy. Experiments show that LSI outperforms lexical matching methods on some well-known test document collections. In this essay, we develop a complete retrieval system based on the LSI model. The experimental results show that the system retrieves documents effectively. We also vary parameters such as the rank, the similarity threshold, and the term composition to find a setting that gives the best retrieval results. Furthermore, we apply several retrieval performance-enhancing techniques to the system. The experimental results demonstrate that relevance feedback and query expansion yield significant improvements in retrieval effectiveness. We also use the folding-in method to append new documents and new index terms to the collection, saving the time and effort required by frequent SVD recomputation.
Date created
2003
DOI
doi:10.7939/R31N7XT9N
License information
Creative Commons Attribution 3.0 Unported

File Details

Date Uploaded
Date Modified
2014-04-24T23:10:30.527+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 1151446
Last modified: 2015:10:12 21:27:57-06:00
Filename: TR03-03.pdf
Original checksum: c347b23233a1695ad474c26fcf722c32
Well formed: true
Valid: true
File title: H:\self2\Essays\BinCheng\final.prn.pdf
File author: stroulia
Page count: 56
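
Illustrative sketch
The Description above outlines how LSI uses a rank-reduced SVD to match a query against documents that share its concept but not its exact words, and how folding-in projects new material into the reduced space. The following is a minimal sketch of that idea and is not taken from the report. The toy term-document matrix, the raw 0/1 term weights, the rank k = 2, all function names, and the power-iteration routine standing in for a real SVD library are assumptions made for illustration. The projection used is the commonly cited folding-in formula, Sigma_k^-1 U_k^T x, with documents represented by the rows of V_k.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def truncated_svd(A, k, iters=500):
    """Top-k singular triplets of A (terms x docs) by power iteration with
    deflation on the Gram matrix A^T A. Illustrative stand-in for a real SVD."""
    cols = [list(c) for c in zip(*A)]                  # document vectors
    G = [[dot(ci, cj) for cj in cols] for ci in cols]  # G = A^T A
    n = len(G)
    U, sigmas, V = [], [], []
    for _ in range(k):
        v = normalize([float(i + 1) for i in range(n)])  # fixed start vector
        for _ in range(iters):
            v = normalize(matvec(G, v))
        lam = dot(v, matvec(G, v))                     # eigenvalue of G = sigma^2
        sigma = math.sqrt(lam)
        U.append([x / sigma for x in matvec(A, v)])    # u = A v / sigma
        sigmas.append(sigma)
        V.append(v)
        # Deflate so the next pass converges to the next singular vector.
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return U, sigmas, V

def fold_in(term_vec, U, sigmas):
    """Project a query or a new document into the k-dim space: Sigma^-1 U^T x."""
    return [dot(u, term_vec) / s for u, s in zip(U, sigmas)]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

# Hypothetical toy collection: rows are terms, columns are documents.
TERMS = ["car", "automobile", "engine", "flower", "petal"]
A = [
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 0],  # petal
]

def term_vector(words):
    return [1.0 if t in words else 0.0 for t in TERMS]

K = 2
U, SIGMAS, V = truncated_svd(A, K)
DOCS = [[V[c][j] for c in range(K)] for j in range(len(A[0]))]  # rows of V_k

def lsi_scores(words):
    q = fold_in(term_vector(words), U, SIGMAS)
    return [cosine(q, d) for d in DOCS]

# Lexical matching: count of shared terms between query and each document.
lexical = [dot(term_vector({"car"}), col) for col in zip(*A)]
# LSI: cosine similarity in the rank-2 space.
scores = lsi_scores({"car"})
# Folding-in a new document uses the same projection as a query.
new_doc = fold_in(term_vector({"automobile"}), U, SIGMAS)

print("lexical:", lexical)
print("lsi:", [round(s, 3) for s in scores])
```

In this toy collection the query "car" lexically matches only the first document, while in the rank-2 space it scores close to 1 against both vehicle documents, including the one that contains "automobile" but not "car", and close to 0 against the two flower documents. A production system would use a library SVD and a proper term-weighting scheme instead of the hand-rolled routine and raw counts shown here.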