Bio2X: A Rule-Based Approach for Semi-automatic Transformation of Semi-structured Biological Data to XML
Data integration of geographically dispersed, heterogeneous, complex biological databases is a key research area. A key feature of a successful data integration system is a simple, self-describing data exchange format. However, many biological databases provide data in flat files, which are a poor data exchange format. Fortunately, XML can be viewed as a powerful data model and a better data exchange format. In this paper, we present the Bio2X system, which transforms flat file data into highly hierarchical XML data using a rule-based machine learning technique. Bio2X has been fully implemented in Java. Our experiments on transforming real-world biological data demonstrate the effectiveness of the Bio2X approach.
S. Yang et al., "Bio2X: A Rule-Based Approach for Semi-automatic Transformation of Semi-structured Biological Data to XML," Data and Knowledge Engineering Journal, Elsevier, Feb 2005.
The definitive version is available at http://dx.doi.org/10.1016/j.datak.2004.05.008
Keywords and Phrases
Flat Files; Machine Learning; Rule Base; Transformer; XML
Article - Journal
© 2005 Elsevier, All rights reserved.