Advances in spaceborne vehicular technology have made long-duration missions in harsh cosmic environments possible. Reliability and data integrity are the most commonly emphasized requirements for spaceborne solid-state mass storage systems, because faults caused by the harsh environment, such as extreme radiation, can occur throughout the mission. Acceptable dependability for these instruments has been achieved through redundancy and repair. Reconfiguration (repair) of memory arrays using spare memory lines is the most common technique for enhancing the reliability of memories with faults. Faulty cells in memory arrays are known to show spatial locality, a physical phenomenon referred to as fault clustering. This paper first investigates a quadrat-based fault model for memory arrays under clustered faults to establish a reliable foundation for measurement. Then, the lifelong dependability of a fault-tolerant spaceborne memory system with hierarchical active redundancy, which consists of spare columns in each memory module and redundant memory modules, is measured in terms of reliability (i.e., the conditional probability that the system performs correctly throughout the mission) and mean-time-to-failure (i.e., the expected time that a system will operate before it fails). Finally, a minimal column redundancy search technique for the fault-tolerant memory system is proposed and verified through a series of parametric simulations. Thereby, the design and fabrication of cost-effective and highly reliable fault-tolerant onboard mass storage systems can be realized for dependable instrumentation.
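The two measures named in the abstract, reliability and mean-time-to-failure, and the idea of searching for the smallest number of spare columns can be illustrated with a minimal sketch. It is not the paper's method: it assumes each column fails independently at a constant rate, so it ignores the clustered (quadrat-based) fault model the paper develops, and every function name and parameter below is hypothetical. Spares are treated as active, meaning they can fail as well; a module works while at least `cols` of its `cols + spares` columns survive, and the system works while at least `modules_needed` of its `modules` modules work.

```python
import math


def k_out_of_n(working_needed, total, p_unit):
    """Probability that at least `working_needed` of `total` independent units work."""
    return sum(
        math.comb(total, j) * p_unit**j * (1.0 - p_unit) ** (total - j)
        for j in range(working_needed, total + 1)
    )


def system_reliability(t, cols, spares, modules, modules_needed, rate):
    """R(t) for the two-level (column spares + module redundancy) sketch.

    Assumption: independent exponential column failures with rate `rate`;
    the clustering of faults is NOT modeled here.
    """
    p_col = math.exp(-rate * t)  # probability one column survives to time t
    r_module = k_out_of_n(cols, cols + spares, p_col)
    return k_out_of_n(modules_needed, modules, r_module)


def mttf(cols, spares, modules, modules_needed, rate, horizon, steps=2000):
    """Trapezoidal estimate of MTTF, the integral of R(t) from 0 to `horizon`.

    `horizon` must be long enough that R(horizon) is negligible.
    """
    dt = horizon / steps
    values = [
        system_reliability(i * dt, cols, spares, modules, modules_needed, rate)
        for i in range(steps + 1)
    ]
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def minimal_spares(target, mission_time, cols, modules, modules_needed, rate,
                   max_spares=64):
    """Smallest spare-column count whose end-of-mission reliability meets `target`.

    A linear scan is enough in this sketch because adding a spare column
    never lowers reliability. Returns None if `max_spares` is not enough.
    """
    for spares in range(max_spares + 1):
        r = system_reliability(
            mission_time, cols, spares, modules, modules_needed, rate
        )
        if r >= target:
            return spares
    return None
```

For example, `minimal_spares(0.999, 5.0, cols=8, modules=3, modules_needed=2, rate=0.01)` would scan spare counts upward for an illustrative 8-column module in a 2-of-3 arrangement. The numbers are placeholders, not values from the paper.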
M. Choi et al., "Reliability Measurement of Mass Storage System for Onboard Instrumentation," IEEE Transactions on Instrumentation and Measurement, vol. 54, no. 6, pp. 2297-2304, Institute of Electrical and Electronics Engineers (IEEE), Dec 2005.
The definitive version is available at https://doi.org/10.1109/TIM.2005.858514
Electrical and Computer Engineering
Keywords and Phrases
Clustered Faults; Aerospace Instrumentation; Data Integrity; Fault Clustering; Fault Tolerant Computing; Fault-Tolerant Spaceborne Memory System; Faulty Cells; Harsh Cosmic Environments; Hierarchical Active Redundancy; Mass Storage System; Mean-Time-To-Failure (MTTF); Memory Array Reconfiguration; Memory Reconfiguration (Repair); Onboard Instrumentation; Onboard Mass Storage System; Quadrat-Based Fault Model; Redundancy; Redundancy Minimization; Reliability; Reliability Measurement; Spaceborne Vehicular Technology; Spare Memory Lines; Storage Management Chips
Article - Journal
© 2005 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.