Distributed Query Scheduling in the Context of DIOM: An Experiment

  • Technical report TR97-03. One of the key issues for query processing in distributed open environments is the query scheduling problem. Given a user query and the n sources known to be relevant to its answer, the first issue is how to decompose the query into n subqueries, each targeting a single source. The second issue is how to synchronize these n subqueries in the presence of inter-site joins. The third issue is how to package and assemble the results from the n information sources according to the original query posed by the user. In this thesis, we discuss the first two issues in the context of DIOM, a distributed and interoperable query mediation system [L. Liu, Distributed and Parallel Databases, Vol. 5, No. 2, 1997]. Our main contribution is the systematic development of a two-tier distributed query scheduling framework that produces the best available query schedule relative to a given combination of cost parameters, including the total query response time, the local query processing cost, and the communication cost. Our main focus is on queries that contain inter-site joins. The first tier, called heuristic-driven query processing, produces a heuristic-based optimal schedule. The second tier, referred to as cost-driven query processing, generates a cost-based optimal schedule. We implement a subset of our query scheduling algorithms in Java, accessible from any Java-compliant GUI viewer such as Netscape 3.0. The URL for the demo is [URL missing in source]. The most interesting feature of our Java implementation is that it lets users trace the query scheduling process through a trace program interface and the trace logs.
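
The two-tier idea in the abstract can be illustrated with a small sketch. This is not the DIOM algorithm, and it is written in Python rather than the report's Java. It assumes a toy cost model in which each subquery has an estimated result size, a local processing cost, and a join selectivity. The first tier applies one common heuristic (smallest estimated result first), and the second tier enumerates join orders and picks the cheapest under a weighted sum of local and communication cost. Total response time is left out for brevity. All names (`Subquery`, `heuristic_schedule`, `schedule_cost`, `cost_driven_schedule`) and all parameters are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Subquery:
    source: str         # hypothetical name of the information source
    est_rows: int       # assumed estimate of rows the source returns
    local_cost: float   # assumed cost of running the subquery at its source
    selectivity: float  # assumed join selectivity when joining with this result


def heuristic_schedule(subqueries):
    """Tier 1 (assumed heuristic): process the smallest estimated results first."""
    return sorted(subqueries, key=lambda s: s.est_rows)


def schedule_cost(order, w_local=1.0, w_comm=1.0, per_row_comm=0.01):
    """Toy cost of a left-deep join order: local work plus rows shipped between sites."""
    local = sum(s.local_cost for s in order)
    comm = 0.0
    carried = float(order[0].est_rows)  # rows in the current intermediate result
    for nxt in order[1:]:
        comm += carried * per_row_comm  # ship the intermediate result to the next site
        carried = carried * nxt.est_rows * nxt.selectivity  # estimated join size
    return w_local * local + w_comm * comm


def cost_driven_schedule(subqueries, **weights):
    """Tier 2 (exhaustive sketch): return the join order with the lowest toy cost."""
    return min(permutations(subqueries), key=lambda o: schedule_cost(o, **weights))
```

Exhaustive enumeration grows factorially with the number of sources, so a sketch like this is only workable for a handful of subqueries. A practical scheduler would presumably use the heuristic tier to prune the candidates that the cost tier evaluates.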

  • License
    Attribution 3.0 International