XML-SIM-CHANGE: Structure and Content Semantic Similarity Detection among XML Document Versions

Abstract

XML documents from different sources may represent the same or similar information with respect to content and structure. Being able to integrate similar XML documents is important to query systems and search engines. However, information changes periodically; it is therefore important to detect the changes among different versions of an XML document and to use the changed information to discover semantic similarity among XML documents. In this paper, we introduce an approach that detects XML similarity by using a change detection mechanism to join XML document versions. In our approach, keys in subtrees play an important role in avoiding unnecessary comparisons of subtrees within different XML versions of the same document. We use a relational database to store XML versions and apply SQL to detect similarities. We show that our approach is highly scalable, is more efficient in terms of execution time, and provides comparable result quality. © 2010 Springer-Verlag.
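
The idea of storing XML versions in a relational database and joining them on subtree keys can be illustrated with a minimal sketch. It is not the paper's algorithm: the table layout (`node`), the function names (`load_version`, `detect_changes`), the use of an `id` attribute as the subtree key, and the sample documents are all assumptions made for illustration. SQLite stands in for the relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample versions; the "id" attribute is assumed to be the subtree key.
OLD_XML = """<catalog>
  <book id="b1"><title>XML Basics</title><price>30</price></book>
  <book id="b2"><title>SQL Joins</title><price>25</price></book>
</catalog>"""

NEW_XML = """<catalog>
  <book id="b1"><title>XML Basics</title><price>35</price></book>
  <book id="b3"><title>Change Detection</title><price>40</price></book>
</catalog>"""


def load_version(conn, version, xml_text):
    """Shred one XML version into rows: (version, subtree key, leaf tag, leaf text)."""
    root = ET.fromstring(xml_text)
    for subtree in root:
        key = subtree.get("id")
        for leaf in subtree:
            conn.execute(
                "INSERT INTO node VALUES (?, ?, ?, ?)",
                (version, key, leaf.tag, (leaf.text or "").strip()),
            )


def detect_changes(old_xml, new_xml):
    """Join two stored versions on the subtree key and report the differences."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (version TEXT, subtree_key TEXT, path TEXT, value TEXT)"
    )
    load_version(conn, "old", old_xml)
    load_version(conn, "new", new_xml)

    # Subtrees with the same key are compared leaf by leaf; subtrees with
    # different keys are never compared against each other.
    updated = conn.execute(
        """
        SELECT o.subtree_key, o.path, o.value, n.value
        FROM node AS o
        JOIN node AS n
          ON o.subtree_key = n.subtree_key AND o.path = n.path
        WHERE o.version = 'old' AND n.version = 'new' AND o.value <> n.value
        ORDER BY o.subtree_key, o.path
        """
    ).fetchall()

    # Keys present in only one version mark deleted or inserted subtrees.
    only_in = """
        SELECT DISTINCT subtree_key FROM node
        WHERE version = ?
          AND subtree_key NOT IN (SELECT subtree_key FROM node WHERE version = ?)
        ORDER BY subtree_key
    """
    deleted = [row[0] for row in conn.execute(only_in, ("old", "new"))]
    inserted = [row[0] for row in conn.execute(only_in, ("new", "old"))]
    conn.close()
    return {"updated": updated, "deleted": deleted, "inserted": inserted}


if __name__ == "__main__":
    print(detect_changes(OLD_XML, NEW_XML))
```

In this sketch the join condition on `subtree_key` is what restricts comparison to matching subtrees, which is the role the abstract attributes to keys. A real system would need a richer schema for nested paths and a similarity measure on top of the detected changes.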

Department(s)

Computer Science

Keywords and Phrases

Change Detection; Join; Keys; XML Similarity

International Standard Book Number (ISBN)

978-364216948-9

International Standard Serial Number (ISSN)

1611-3349; 0302-9743

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Springer, All rights reserved.

Publication Date

16 Dec 2010
