Permanent link (DOI): https://doi.org/10.7939/R34F1MH80

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Relational Databases for Querying Natural Language Text (Open Access)

Descriptions

Author or creator
Chubak, Pirooz
Rafiei, Davood
Subject/Keyword
natural language queries
relational databases
Type of item
Computing Science Technical Report
Computing science technical report ID
TR07-08
Language
English
Description
Technical report TR07-08. With the vast amount of information stored in natural language text, sophisticated query engines are needed to extract data and effectively relate the pieces. While there has been a great deal of activity around semistructured data, and XML in particular, there has not been much work on querying natural language text, despite the regularities that such text exhibits. This paper explores a more conservative approach in which natural language text is stored in a relational database. We present a framework for querying and integrating natural language text with relational data and investigate different strategies for optimizing queries. Our results show that the size of the plan space depends on the number of query terms and the overlap between query rewritings. Moreover, we show that finding an optimal plan in the presence of rewritings is NP-hard. We develop a cost model and pruning techniques to reduce the size of the search space, and a polynomial-time greedy algorithm that finds a plan, not necessarily optimal, over a set of rewritings. Our experimental results indicate substantial savings in the evaluation costs of the optimized queries and show that our greedy algorithm finds either an optimal plan or a plan that is very close to optimal in terms of cost.
Date created
2007
DOI
doi:10.7939/R34F1MH80
License information
Creative Commons Attribution 3.0 Unported

File Details

Date Uploaded
Date Modified
2014-04-29T17:30:58.037+00:00
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 2693056
Last modified: 2015:10:12 13:20:43-06:00
Filename: TR07-08.pdf
Original checksum: aaaef148ae06bbf1ba9a31a3f95a6ced
Well formed: true
Valid: true
Page count: 13