Abstract

In this paper, we present a mechanism for detecting and representing changes, given the old and new versions of a set of interlinked Web documents retrieved in response to a user's query. In particular, we show how to detect and represent Web deltas, i.e., changes in the Web documents that are relevant to a user's query, in the context of our Web warehousing system called WHOWEDA (Warehouse of Web Data). In WHOWEDA, Web information is stored as materialized views in Web tables, in the form of Web tuples. These Web tuples, represented as directed graphs, can be manipulated using a set of Web algebraic operators. We detect relevant Web deltas using two of these operators, the Web join and the outer Web join. The Web join is used to detect identical documents residing in two Web tables, whereas the outer Web join, a derivative of the Web join, is used to identify dangling Web tuples. We show how to represent these changes using delta Web tables. We develop formal algorithms for the generation of delta Web tables that identify the Web documents that have been added, deleted, or modified since the last query.
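
The idea of matching identical documents and then classifying what is left over can be illustrated with a minimal sketch. It is not the WHOWEDA algorithm: it assumes each snapshot is a flat mapping from a document URL to a content fingerprint, whereas the paper operates on Web tuples represented as directed graphs. The names `Delta` and `detect_delta` and the sample data are hypothetical.

```python
from typing import Dict, NamedTuple, Set


class Delta(NamedTuple):
    """Hypothetical stand-in for the three kinds of delta Web tables."""
    added: Set[str]
    deleted: Set[str]
    modified: Set[str]


def detect_delta(old: Dict[str, str], new: Dict[str, str]) -> Delta:
    """Compare two snapshots that map a document URL to a content fingerprint.

    Assumption: a flat URL -> fingerprint mapping replaces the graph-shaped
    Web tuples described in the abstract.
    """
    # Join-like step: documents present in both snapshots with identical content.
    unchanged = {url for url in old.keys() & new.keys() if old[url] == new[url]}
    # Outer-join-like step: whatever did not match is left dangling on each side.
    dangling_old = old.keys() - unchanged
    dangling_new = new.keys() - unchanged
    return Delta(
        added=dangling_new - dangling_old,      # only in the new snapshot
        deleted=dangling_old - dangling_new,    # only in the old snapshot
        modified=dangling_old & dangling_new,   # same URL, different content
    )


old_snapshot = {"a.html": "v1", "b.html": "v1", "c.html": "v1"}
new_snapshot = {"a.html": "v1", "b.html": "v2", "d.html": "v1"}
delta = detect_delta(old_snapshot, new_snapshot)
print(delta)
```

In this toy example, `a.html` matches in both snapshots, `b.html` is reported as modified, `c.html` as deleted, and `d.html` as added.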

Department(s)

Computer Science

Keywords and Phrases

Internet; WHOWEDA; Web Documents; Web Join; Web Warehouse; Web Warehousing; Data Warehouses; Delta Web Tables; Information Retrieval; Interlinked Web Documents; Query; Query Processing

International Standard Serial Number (ISSN)

1041-4347

Document Type

Article - Journal

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2003 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.

Publication Date

01 Jan 2003
