Abstract
We consider the problem of identifying the source of failure in a network after receiving alarms or observing symptoms. Locating the root cause accurately and promptly in a large communication system is challenging because a single fault can often result in a large number of alarms, and multiple faults can occur concurrently. In this paper, we present a new fault localization method using a machine-learning approach. We propose to use logistic regression to study the correlation among network events based on end-to-end measurements. Then, based on the regression model, we develop a fault hypothesis that best explains the observed symptoms. Unlike previous work, the machine-learning algorithm requires neither knowledge of the dependencies among network events, nor the probabilities of faults, nor the conditional probabilities of fault propagation as input. This 'low requirement' feature makes it suitable for large, complex networks where accurate dependencies and prior probabilities are difficult to obtain. We then evaluate the performance of the learning algorithm with respect to the accuracy of the fault hypothesis and the concentration property. Experimental results and theoretical analysis both show satisfactory performance.
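As a rough illustration of the idea in the abstract, the sketch below fits one logistic regression per component on binary end-to-end path observations and then reports the components whose predicted fault probability exceeds a threshold. It is not the authors' algorithm. The three-component, four-path topology, the fault and noise rates, the use of labeled historical observations, and every name in the code (`PATHS`, `simulate`, `fit_logistic`, `hypothesize`) are assumptions made for this example. The topology is used only to generate synthetic training data and is never passed to the learner.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; w[1:] are the per-symptom weights.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic-regression weights by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Hypothetical topology (assumption): the components traversed by each monitored path.
# It only drives the simulator below; the learner never sees it.
PATHS = [{0}, {0, 1}, {1, 2}, {2}]
NUM_COMPONENTS = 3

def simulate(rng, n, fault_rate=0.2, noise=0.02):
    """Generate (symptom vector, fault vector) pairs with a small measurement noise."""
    symptoms, faults = [], []
    for _ in range(n):
        f = [1 if rng.random() < fault_rate else 0 for _ in range(NUM_COMPONENTS)]
        s = []
        for path in PATHS:
            failed = any(f[c] for c in path)
            if rng.random() < noise:  # occasional false or missed alarm
                failed = not failed
            s.append(1 if failed else 0)
        symptoms.append(s)
        faults.append(f)
    return symptoms, faults

rng = random.Random(0)
S, F = simulate(rng, 400)

# One regression model per component: symptom vector -> probability that it is faulty.
models = [fit_logistic(S, [f[c] for f in F]) for c in range(NUM_COMPONENTS)]

def hypothesize(symptoms, threshold=0.5):
    """Return the components whose predicted fault probability exceeds the threshold."""
    return [c for c, w in enumerate(models) if predict_proba(w, symptoms) > threshold]

print(hypothesize([1, 1, 0, 0]))  # paths 0 and 1 failed, paths 2 and 3 healthy
```

Treating each component as an independent binary target is a simplification chosen to keep the example short; it says nothing about how the paper constructs its fault hypothesis or measures the concentration property.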
Recommended Citation
M. X. Cheng and W. B. Wu, "Data Analytics for Fault Localization in Complex Networks," IEEE Internet of Things Journal, vol. 3, no. 5, pp. 701-708, article no. 7336493, Institute of Electrical and Electronics Engineers, Oct 2016.
The definitive version is available at https://doi.org/10.1109/JIOT.2015.2503270
Department(s)
Computer Science
Keywords and Phrases
Complex networks; computer network reliability; fault diagnosis; fault location; logistic regression; machine learning
International Standard Serial Number (ISSN)
2327-4662
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2024 Institute of Electrical and Electronics Engineers, All rights reserved.
Publication Date
01 Oct 2016