We present a novel implementation of Replica Exchange Statistical Temperature Molecular Dynamics (RESTMD), a generalized ensemble method built on replica exchange, which is also known as parallel tempering. Our implementation uses an iterative MapReduce (MR)-based framework to launch RESTMD on high performance computing (HPC) clusters, including our test bed, Cyber-infrastructure for Reconfigurable Optical Networks (CRON), which simulates a network-connected distributed system. Our main contributions are a new implementation of statistical temperature molecular dynamics (STMD) plugged into the well-known CHARMM molecular dynamics package and a Hadoop-powered RESTMD implementation that scales out effectively within a cluster and across distributed systems. To address the challenges of using Hadoop MapReduce, we examined the factors that contribute to the performance of the proposed framework through runtime analysis experiments on two biological systems of different sizes and over different types of HPC resources. The many advantages of RESTMD suggest its effectiveness for enhanced sampling, one of the grand challenges in areas of study ranging from chemical systems to statistical inference. Lastly, with its scale-across capacity over distributed computing infrastructure (DCI) and its use of Hadoop for coarse-grained task-level parallelism, MapReduce-based RESTMD is a good example of the next generation of applications that science gateway projects, in particular those backed by IaaS clouds, increasingly demand. © 2014 IEEE.


Computer Science

Keywords and Phrases

Distributed; MapReduce; RESTMD

Document Type

Article - Conference proceedings

© 2024 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Jan 2014