Relational Databases for Querying Natural Language Text
-
- Author(s) / Creator(s)
-
Technical report TR07-08. With the vast amount of information stored in natural language text, sophisticated query engines are needed to extract data and relate the pieces effectively. While there has been a great deal of activity around semistructured data, and XML in particular, there has been little work on querying natural language text, despite the regularities it exhibits. This paper explores a more conservative approach in which natural language text is stored in a relational database. We present a framework for querying natural language text and integrating it with relational data, and we investigate different strategies for optimizing queries. Our results show that the size of the plan space depends on the number of query terms and the overlap between query rewritings. Moreover, we show that finding an optimal plan in the presence of rewritings is NP-hard. We develop a cost model and pruning techniques to reduce the size of the search space, and a polynomial-time greedy algorithm that finds a possibly suboptimal plan over a set of rewritings. Our experimental results indicate substantial savings in the evaluation costs of the optimized queries, and show that our greedy algorithm finds either an optimal plan or a plan whose cost is very close to optimal. | TRID-ID TR07-08
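To make the idea of storing natural language text in a relational database concrete, here is a minimal sketch. It is not the report's framework: the one-row-per-word `token` table, the helper names `build_token_table` and `find_around_phrase`, and the use of SQLite are all assumptions chosen for illustration. It shows how a fixed phrase with a wildcard on each side could be matched with ordinary self-joins on word position.

```python
import sqlite3


def build_token_table(sentences):
    """Load sentences into an in-memory SQLite table, one row per word.

    The schema (sent_id, pos, word) is a hypothetical layout, not the
    one used in the report.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token (sent_id INTEGER, pos INTEGER, word TEXT, "
        "PRIMARY KEY (sent_id, pos))"
    )
    for sent_id, sentence in enumerate(sentences):
        for pos, word in enumerate(sentence.lower().split()):
            conn.execute(
                "INSERT INTO token VALUES (?, ?, ?)", (sent_id, pos, word)
            )
    return conn


def find_around_phrase(conn, phrase):
    """Return (sent_id, word_before, word_after) for each match of phrase.

    Alias t0 is the word before the phrase, t1..tn are the phrase words,
    and t{n+1} is the word after; each alias is joined on its offset
    from t0 within the same sentence.
    """
    words = phrase.lower().split()
    if not words:
        raise ValueError("phrase must contain at least one word")
    n = len(words)
    joins = " ".join(
        f"JOIN token t{i} ON t{i}.sent_id = t0.sent_id "
        f"AND t{i}.pos = t0.pos + {i}"
        for i in range(1, n + 2)
    )
    where = " AND ".join(f"t{i}.word = ?" for i in range(1, n + 1))
    sql = (
        f"SELECT t0.sent_id, t0.word, t{n + 1}.word FROM token t0 {joins} "
        f"WHERE {where} ORDER BY t0.sent_id, t0.pos"
    )
    return conn.execute(sql, words).fetchall()


if __name__ == "__main__":
    conn = build_token_table([
        "Ottawa is the capital of Canada",
        "Paris is the capital of France",
        "Edmonton is a city in Alberta",
    ])
    for row in find_around_phrase(conn, "is the capital of"):
        print(row)
```

Because the text lives in an ordinary table, the matched words could be joined against other relational data in the same query. Each rewriting of a query (for example, a paraphrase of the phrase) would yield another such join, which suggests why the number of query terms and the overlap between rewritings matter for plan cost.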
-
- Date created
- 2007
-
- Subjects / Keywords
-
- Type of Item
- Report
-
- License
- Attribution 3.0 International