Bio2X: A Rule-Based Approach for Semi-automatic Transformation of Semi-structured Biological Data to XML

Abstract

Integrating geographically dispersed, heterogeneous, and complex biological databases is a key research area. One of the key features of a successful data integration system is a simple, self-describing data exchange format. However, many biological databases provide data in flat files, which are poor data exchange formats. XML, by contrast, can serve as both a powerful data model and a better data exchange format. In this paper, we present Bio2X, a system that transforms flat-file data into highly hierarchical XML data using a rule-based machine learning technique. Bio2X has been fully implemented in Java. Our experiments on transforming real-world biological data demonstrate the effectiveness of the Bio2X approach.

Department(s)

Computer Science

Keywords and Phrases

Flat Files; Machine Learning; Rule Base; Transformer; XML

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2005 Elsevier, All rights reserved.

Publication Date

01 Feb 2005
