Abstract
The term schema denotes the way in which a data model chooses to model its data. In this paper, we discuss schemas of a set of HTML or XML documents retrieved from the Web in the context of our web warehousing system called WHOWEDA (Warehouse of Web Data). Web schemas are used to bind a web table that contains a collection of interlinked web documents called web tuples. These schemas specify some of the metadata, content, and structural properties (in the form of predicates) shared by some of the web documents and hyperlinks in the web table. They also summarize the hyperlink structure of these documents using the notion of connectivities. We show how web schemas are generated in WHOWEDA and discuss the different types of operations that may be performed on web schemas.
Recommended Citation
S. S. Bhowmick et al., "Web Schemas in WHOWEDA," DOLAP: Proceedings of the ACM International Workshop on Data Warehousing and OLAP, pp. 17-24, Association for Computing Machinery, Nov 2000.
The definitive version is available at https://doi.org/10.1145/355068.355311
Department(s)
Computer Science
Keywords and Phrases
Schema operations; Web schema; Web warehouse
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2024 Association for Computing Machinery, All rights reserved.
Publication Date
01 Nov 2000