Masters Theses

Author

Yan Chen

Keywords and Phrases

DiffXML (Computer file)

Abstract

"With the role of XML getting more and more important in Web applications, how to store XML files is getting much more concern than ever. Users are often not only interested in the current values of documents but also in changes, from which they can learn about the evolution of the web. In this thesis, a method to map XML files to relational data models and store them in a relational database is introduced. The method parses XML files as DOM trees and stores value and path information for each node in a relational table. The method takes linear time in storing different sizes of XML data. An algorithm 'DiffXML' is presented which uses SQL operations to detect changes between two versions of XML files which are stored in the database. The value and path information for XML files are used to detect differences. DiffXML finds new inserted nodes, deleted nodes, updated nodes, and also finds the move of an XML DOM subtree from one place to the other. The performance of the DiffXML algorithm is analyzed and compared with some current commercial XML change detection tools. In some of the cases, our method performs better than the existing methods"--Abstract, page iv.
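
As a rough illustration of the approach the abstract describes, the following Python sketch stores one row per DOM node (its path and value) in a relational table and uses SQL to report inserted, deleted, and updated nodes between two stored versions. It is not the thesis's DiffXML algorithm. The table name `node`, the positional path format, and the functions `create_schema`, `store`, and `diff` are assumptions made for this example, and SQLite stands in for the relational database. Attributes are ignored and subtree moves are not detected; a moved subtree would generally appear as deletions plus insertions.

```python
import sqlite3
import xml.etree.ElementTree as ET


def create_schema(conn):
    # Hypothetical schema: one row per element, keyed by version and path.
    conn.execute("CREATE TABLE node (version INTEGER, path TEXT, value TEXT)")


def store(conn, version, xml_text):
    """Parse xml_text and store the path and text value of every element."""
    root = ET.fromstring(xml_text)

    def walk(elem, path):
        value = (elem.text or "").strip()
        conn.execute("INSERT INTO node VALUES (?, ?, ?)", (version, path, value))
        seen = {}  # counts same-tag siblings so each path is unique
        for child in elem:
            seen[child.tag] = seen.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{seen[child.tag]}]")

    walk(root, f"/{root.tag}[1]")


def diff(conn, old, new):
    """Return (inserted, deleted, updated) path lists between two versions."""
    only_in = (
        "SELECT path FROM node WHERE version = ? AND path NOT IN "
        "(SELECT path FROM node WHERE version = ?) ORDER BY path"
    )
    inserted = [row[0] for row in conn.execute(only_in, (new, old))]
    deleted = [row[0] for row in conn.execute(only_in, (old, new))]
    changed = (
        "SELECT a.path FROM node a JOIN node b ON a.path = b.path "
        "WHERE a.version = ? AND b.version = ? AND a.value <> b.value "
        "ORDER BY a.path"
    )
    updated = [row[0] for row in conn.execute(changed, (old, new))]
    return inserted, deleted, updated


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    store(conn, 1, "<catalog><book><price>10</price></book><note>old</note></catalog>")
    store(conn, 2, "<catalog><book><price>12</price><year>2003</year></book></catalog>")
    print(diff(conn, 1, 2))
```

Keying rows on a positional path keeps the comparison to plain set operations and a join, at the cost of treating any change in sibling order as a set of deletions and insertions.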

Advisor(s)

Sanjay K. Madria

Committee Member(s)

Daniel C. St. Clair
Richard H. Hall

Department(s)

Computer Science

Degree Name

M.S. in Computer Science

Publisher

University of Missouri--Rolla

Publication Date

Summer 2003

Pagination

ix, 60 pages

Note about bibliography

Includes bibliographical references (pages 58-59)

Rights

© 2003 Yan Chen, All rights reserved.

Document Type

Thesis - Restricted Access

File Type

text

Language

English

Subject Headings

XML (Document markup language)
Relational databases

Thesis Number

T 8333

Print OCLC #

54863699
