Abstract

The term schema denotes the way a data model chooses to model its data. In this paper we discuss schemas of a set of HTML or XML documents retrieved from the Web, in the context of our web warehousing system called WHOWEDA (Warehouse of Web Data). Web schemas are used to bind a web table that contains a collection of interlinked web documents called web tuples. These schemas specify some of the metadata, content, and structural properties (in the form of predicates) shared by some of the web documents and hyperlinks in the web table. They also summarize the hyperlink structure of these documents using the notion of connectivities. We show how web schemas are generated in WHOWEDA and discuss the different types of operations that may be performed on web schemas.

Department(s)

Computer Science

Keywords and Phrases

Schema operations; Web schema; Web warehouse

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Association for Computing Machinery, All rights reserved.

Publication Date

01 Nov 2000
