Performance Analysis of Fault Tolerant Multistage Interconnection Networked Parallel Instrumentation with Concurrent Testing and Diagnosis

Abstract

Performance and reliability are two of the most crucial issues in today's high-performance instrumentation and measurement systems. These systems have improved their performance through parallel and distributed processing. High-speed, high-density Multistage Interconnection Networks (MINs) are a widely used subsystem of parallel processing and communication systems. This paper proposes new performance models for evaluating fault tolerant MINs, thereby establishing a sound foundation for assuring their performance and reliability with a high confidence level during parallel instrumentation. A concurrent fault detection and recovery scheme for MINs is introduced to enable a generic approach to fault tolerance by rerouting over the redundant interconnection links. A switch architecture that realizes the concurrent testing and diagnosis is shown. The proposed performance models are used to evaluate the compound effect of fault tolerant operations, such as testing, diagnosis and recovery, on throughput and delay. Results are shown for single transient and permanent stuck-at faults on links and on storage units in switching elements. The performance degradation caused by the fault tolerance overhead is shown to be quite graceful, while the performance degradation without fault recovery is unacceptable.

Meeting Name

19th IEEE Instrumentation and Measurement Technology Conference: IMTC (2002: May 21-23, Anchorage, AK)

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Computer System Recovery; Data Communication Systems; Distributed Computer Systems; Failure Analysis; Fault Tolerant Computer Systems; Instrument Testing; Mathematical Models; Parallel Processing Systems; Performance; Telecommunication Links; Concurrent Fault Detection; Fault Detection; Multistage Interconnection Network; Parallel Instrumentation; Interconnection Networks; Diagnosis; Distributed Systems; Instrumentation; Parallel Processing; Performance Analysis

International Standard Book Number (ISBN)

0780372182

International Standard Serial Number (ISSN)

1091-5281

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2002 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.

Publication Date

01 May 2002
