Warehousing the Web or Building Virtual Web Views

  • Technical report TR01-06. The World-Wide Web (WWW) has become so ubiquitous that we take it for granted. There is no need to emphasize how dynamic, large, rich, and unstructured, yet important, the Web is. From researchers and engineers to children and retirees, everyone uses the WWW for a variety of needs. A multitude of tools and search engines have been developed to find and retrieve resources from the Web. However, the experience with search engines is often frustrating: it is very difficult to find relevant information or patterns within resources on the Internet, if they are found at all. The idea presented in this paper is to "warehouse" the Web in a structure that would allow efficient information retrieval and knowledge discovery from the Internet. Warehousing the Web in this context consists of creating different virtual web views with layered databases of descriptors organized hierarchically. Using a declarative ad hoc mining language, one can find and pinpoint explicit as well as implicit knowledge from the web warehouse.

