Sets and bags are closely related structures and have been studied in relational databases. A bag differs from a set in that it is sensitive to the number of times an element occurs, whereas a set is not. In this paper, we introduce the concept of a Web bag in the context of a World Wide Web warehouse called WHOWEDA (WareHouse Of WEb DAta), which we are currently building. Informally, a Web bag is a Web table that allows multiple occurrences of identical Web tuples. A Web bag helps one discover useful knowledge from a Web table, such as visible documents or Web sites (i.e., documents or sites that can be reached by many paths), luminous documents (i.e., documents with many outgoing links), and luminous paths (i.e., frequently traversed paths). We also provide a cost-benefit analysis of materializing Web bags as compared to Web tables with distinct Web tuples.
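The bag/set distinction described in the abstract can be illustrated with a minimal sketch. It is not WHOWEDA's implementation: the Web table is assumed here to be a plain list of paths (tuples of document names), and the names `web_tuples`, `project`, and `links` are hypothetical, introduced only for this example.

```python
from collections import Counter

# Hypothetical Web table: each Web tuple is modelled as a path of documents.
web_tuples = [
    ("a.html", "b.html", "d.html"),
    ("a.html", "c.html", "d.html"),
    ("a.html", "b.html", "d.html"),  # identical tuple, which a bag retains
]

def project(tuples, positions):
    """Project each tuple onto the given positions, keeping duplicates (a bag)."""
    return Counter(tuple(t[i] for i in positions) for t in tuples)

def links(tuples):
    """Distinct (source, target) links appearing along any path."""
    return {(t[i], t[i + 1]) for t in tuples for i in range(len(t) - 1)}

# Projecting on the last document: the bag keeps multiplicities, the set does not.
bag = project(web_tuples, [2])
as_set = set(bag)

# Assumed proxy for visibility: how many paths reach a document.
visibility = bag[("d.html",)]

# Assumed proxy for luminosity: number of distinct outgoing links per document.
fan_out = Counter(source for source, _ in links(web_tuples))

# Frequently traversed paths: multiplicity of each whole tuple.
path_counts = Counter(web_tuples)
```

In this toy data, the set view of the projection holds a single entry for `d.html`, while the bag view records that three paths lead to it, which is the kind of information the abstract attributes to Web bags.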
S. K. Madria et al., "Cost-benefit Analysis of Web Bag in a Web Warehouse," Proceedings of the International Symposium on Database Engineering and Applications, 1999, Institute of Electrical and Electronics Engineers (IEEE), Jan 1999.
The definitive version is available at http://dx.doi.org/10.1109/IDEAS.1999.787249
International Symposium on Database Engineering and Applications, 1999
Keywords and Phrases
WHOWEDA; Web Bags; Web Tables; Web Tuples; World Wide Web Warehouse; Cost-Benefit Analysis; Data Mining; Data Structures; Data Warehouses; Element Occurrence; Fan-In; Fan-Out; Frequently Traversed Paths; Identical Web Types; Information Resources; Luminous Documents; Luminous Paths; Outgoing Links; Search Engines; Useful Knowledge Discovery; Visible Web Sites; Visible Documents
Article - Conference proceedings
© 1999 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.