Pi-Web Join in a Web Warehouse

Sanjay Kumar Madria, Missouri University of Science and Technology
Wee Keong Ng
Ee-Peng Lim
Sourav S. Bhowmick

With the enormous amount of data stored on the World Wide Web, it is increasingly important to design and develop powerful web warehousing tools. The key objective of our web warehousing project, called WHOWEDA (Warehouse of Web Data), is to design and implement a web warehouse that materializes and manages useful information from the web. We introduce the concept of the Π-web join in the context of WHOWEDA. The Π-web join operator is a web information manipulation operator that combines relevant web information residing in two web tables. Informally, it is a combination of the web join and web project operators that filters out irrelevant information from a joined web table. We show how to construct the Π-joined web table and its schema. We also highlight the benefits of the Π-web join operator.