The Robustness of Resource Allocation in Parallel and Distributed Computing Systems

Shoukat Ali, Missouri University of Science and Technology
Howard Jay Siegel
A. A. Maciejewski

This document has been relocated to http://scholarsmine.mst.edu/ele_comeng_facwork/1305

Abstract

This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel. Performing computing and communication tasks on parallel and distributed systems involves the coordinated use of different types of machines, networks, interfaces, and other resources. Because of uncertainties in the system environment, decisions about how best to allocate resources are often based on estimated values of task and system parameters. An important research problem is the development of resource management strategies that can guarantee a particular system performance despite such uncertainties. We have designed a methodology for deriving the degree of robustness of a resource allocation, defined as the maximum amount of collective uncertainty in system parameters within which a user-specified level of system performance (quality of service, QoS) can be guaranteed. We will present our four-step procedure for deriving a robustness metric for an arbitrary system and illustrate the procedure and its usefulness by deriving robustness metrics for some example distributed systems.
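
To make the definition of robustness concrete, the following minimal sketch computes such a metric for one simple case. Everything specific in it is an assumption for illustration and is not stated in the abstract: the performance feature is taken to be each machine's finishing time (the sum of the estimated execution times of the tasks assigned to it), the QoS requirement is that no finishing time exceeds a tolerance factor times the estimated makespan, and collective uncertainty is measured as the Euclidean norm of the change in the execution-time vector. The function names `robustness_radius` and `robustness_metric` are hypothetical.

```python
import math


def robustness_radius(est_times, bound):
    """Smallest Euclidean change in one machine's task execution times
    that raises their sum (the machine's finishing time) to `bound`.

    This is the distance from the point `est_times` to the hyperplane
    sum(times) == bound, namely (bound - sum(est_times)) / sqrt(n).
    """
    n = len(est_times)
    if n == 0:
        # Assumption: a machine with no tasks cannot violate the bound.
        return math.inf
    return (bound - sum(est_times)) / math.sqrt(n)


def robustness_metric(allocation, tolerance):
    """Minimum robustness radius over all machines.

    `allocation` maps a machine name to the list of estimated execution
    times of the tasks assigned to it. `tolerance` (>= 1) is the assumed
    QoS factor: actual finishing times must stay within
    tolerance * estimated makespan.
    """
    makespan = max(sum(times) for times in allocation.values())
    bound = tolerance * makespan
    return min(robustness_radius(times, bound) for times in allocation.values())


# Hypothetical allocation: two machines with estimated task times.
example = {"m1": [3.0, 4.0, 5.0, 4.0], "m2": [10.0]}
print(robustness_metric(example, tolerance=1.25))
```

Under these assumptions, the returned value is the largest collective error in the execution-time estimates, in the Euclidean sense, that the allocation is certain to tolerate without any machine exceeding the finishing-time bound. The machine with the smallest radius is the one that limits the robustness of the allocation.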