In this paper, we present a mechanism for detecting and representing changes, given the old and new versions of a set of interlinked Web documents retrieved in response to a user's query. In particular, we show how to detect and represent Web deltas, i.e., changes in the Web documents that are relevant to a user's query, in the context of our Web warehousing system called WHOWEDA (Warehouse of Web Data). In WHOWEDA, Web information is held as materialized views stored in Web tables in the form of Web tuples. These Web tuples, represented as directed graphs, can be manipulated using a set of Web algebraic operators. We detect relevant Web deltas using Web algebraic operators such as the Web join and the outer Web join. Web join is used to detect identical documents residing in two Web tables, whereas outer Web join, a derivative of Web join, is used to identify dangling Web tuples. We show how to represent these changes using delta Web tables. We develop formal algorithms for the generation of delta Web tables identifying Web documents that have been added, deleted, or modified since the last query.
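The idea of using a join to find unchanged documents and an outer join to find dangling ones can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it flattens each Web table to a set of documents rather than graph-structured Web tuples, and it assumes a document is identified by its URL and that a checksum signals a content change. The names `Doc`, `web_join`, `outer_web_join`, and `delta_tables` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Simplified stand-in for a Web document (assumption: URL + checksum)."""
    url: str
    checksum: str


def web_join(old, new):
    """Return documents that appear identically in both tables."""
    return set(old) & set(new)


def outer_web_join(old, new):
    """Return the dangling documents of each table, i.e. those that did not join."""
    joined = web_join(old, new)
    return set(old) - joined, set(new) - joined


def delta_tables(old, new):
    """Classify dangling documents as added, deleted, or modified by URL."""
    dangling_old, dangling_new = outer_web_join(old, new)
    old_urls = {d.url for d in dangling_old}
    new_urls = {d.url for d in dangling_new}
    return {
        "added": sorted(new_urls - old_urls),     # only in the new version
        "deleted": sorted(old_urls - new_urls),   # only in the old version
        "modified": sorted(old_urls & new_urls),  # same URL, different checksum
    }


old_table = [Doc("a.html", "1"), Doc("b.html", "2"), Doc("c.html", "3")]
new_table = [Doc("a.html", "1"), Doc("b.html", "9"), Doc("d.html", "4")]
deltas = delta_tables(old_table, new_table)
print(deltas)
```

In this toy input, `a.html` joins and is therefore unchanged, `b.html` dangles on both sides and is reported as modified, `c.html` as deleted, and `d.html` as added.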
S. K. Madria et al., "Detecting and Representing Relevant Web Deltas in WHOWEDA," IEEE Transactions on Knowledge and Data Engineering, Institute of Electrical and Electronics Engineers (IEEE), Jan 2003.
The definitive version is available at http://dx.doi.org/10.1109/TKDE.2003.1185843
Keywords and Phrases
Internet; WHOWEDA; Web Documents; Web Join; Web Warehouse; Web Warehousing; Data Warehouses; Delta Web Tables; Information Retrieval; Interlinked Web Documents; Query; Query Processing
Article - Journal
© 2003 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.