Masters Theses
Keywords and Phrases
DiffXML (Computer file)
Abstract
"With the role of XML getting more and more important in Web applications, how to store XML files is getting much more concern than ever. Users are often not only interested in the current values of documents but also in changes, from which they can learn about the evolution of the web. In this thesis, a method to map XML files to relational data models and store them in a relational database is introduced. The method parses XML files as DOM trees and stores value and path information for each node in a relational table. The method takes linear time in storing different sizes of XML data. An algorithm 'DiffXML' is presented which uses SQL operations to detect changes between two versions of XML files which are stored in the database. The value and path information for XML files are used to detect differences. DiffXML finds new inserted nodes, deleted nodes, updated nodes, and also finds the move of an XML DOM subtree from one place to the other. The performance of the DiffXML algorithm is analyzed and compared with some current commercial XML change detection tools. In some of the cases, our method performs better than the existing methods"--Abstract, page iv.
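The approach the abstract describes (one relational row of path and value information per DOM node, with SQL queries to compare two stored versions) can be illustrated with a minimal sketch. This is not the thesis's DiffXML algorithm: the `nodes` table, its columns, the positional path format, and the function names `open_store`, `store`, and `diff` are all assumptions made for illustration. The sketch ignores attributes and mixed content, and it does not attempt the subtree move detection the abstract mentions.

```python
import sqlite3
import xml.etree.ElementTree as ET


def open_store():
    """Create an in-memory database with a hypothetical node table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (version INTEGER, path TEXT, value TEXT)")
    return conn


def store(conn, version, xml_text):
    """Insert one (version, path, value) row per element, visiting each node once."""
    root = ET.fromstring(xml_text)

    def walk(node, path):
        value = (node.text or "").strip()
        conn.execute(
            "INSERT INTO nodes (version, path, value) VALUES (?, ?, ?)",
            (version, path, value),
        )
        counts = {}  # per-tag sibling position, so repeated tags get distinct paths
        for child in node:
            counts[child.tag] = counts.get(child.tag, 0) + 1
            walk(child, f"{path}/{child.tag}[{counts[child.tag]}]")

    walk(root, f"/{root.tag}[1]")


def _paths_only_in(conn, present, absent):
    """Paths stored for version `present` that have no row in version `absent`."""
    rows = conn.execute(
        "SELECT path FROM nodes WHERE version = ? AND path NOT IN "
        "(SELECT path FROM nodes WHERE version = ?) ORDER BY path",
        (present, absent),
    ).fetchall()
    return [r[0] for r in rows]


def diff(conn, old, new):
    """Classify inserted, deleted, and updated paths using SQL queries only."""
    updated = conn.execute(
        "SELECT a.path FROM nodes a JOIN nodes b ON a.path = b.path "
        "WHERE a.version = ? AND b.version = ? AND a.value <> b.value "
        "ORDER BY a.path",
        (old, new),
    ).fetchall()
    return {
        "inserted": _paths_only_in(conn, new, old),
        "deleted": _paths_only_in(conn, old, new),
        "updated": [r[0] for r in updated],
    }


if __name__ == "__main__":
    conn = open_store()
    store(conn, 1, "<book><title>XML</title><year>2002</year></book>")
    store(conn, 2, "<book><title>XML</title><year>2003</year><isbn>123</isbn></book>")
    print(diff(conn, 1, 2))
```

Because the sketch keys nodes by positional path, inserting a sibling ahead of existing ones shifts their paths and is reported as a chain of updates; a real change detector would need a more robust node identity.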
Advisor(s)
Sanjay K. Madria
Committee Member(s)
Daniel C. St. Clair
Richard H. Hall
Department(s)
Computer Science
Degree Name
M.S. in Computer Science
Publisher
University of Missouri--Rolla
Publication Date
Summer 2003
Pagination
ix, 60 pages
Note about bibliography
includes bibliographical references (pages 58-59)
Rights
© 2003 Yan Chen, All rights reserved.
Document Type
Thesis - Restricted Access
File Type
text
Language
English
Subject Headings
XML (Document markup language)
Relational databases
Thesis Number
T 8333
Print OCLC #
54863699
Recommended Citation
Chen, Yan, "DiffXML: detecting changes in XML data" (2003). Masters Theses. 2395.
https://scholarsmine.mst.edu/masters_theses/2395