A P2P Integration Architecture for Protein Resources

K. T. Claypool
Sanjay Kumar Madria, Missouri University of Science and Technology

A direct, computer-supported pathway from primary sequence (de novo or DNA-derived) to macromolecular structure to biological function is the ultimate goal for a protein scientist. Today's state-of-the-art protein resources, together with ongoing research and experiments, provide the raw data that can enable protein scientists to achieve at least some steps of this goal. Protein scientists are therefore looking to broaden their benchtop research from specific problems to work that draws on the large body of electronic information now available. Currently, however, the burden falls on the scientist to manually interface with each data resource, integrate the required information, and then interpret the results. Discovery is impeded by the lack of tools that can not only integrate information from several known data resources, but also weave in new information as it is discovered and brought online by other research groups. We propose a novel peer-to-peer architecture that allows protein scientists to share resources, in the form of data and tools, within their community, facilitating ad hoc, decentralized sharing of data. In this paper, we present an overview of this integration architecture and briefly describe the tools that are essential to this framework.