Abstract
With the continuous growth of IoT applications, service failures are inevitable. Given the complexity and dynamics of IoT services, root cause analysis (RCA) following an alert helps resolve faults quickly. However, the metrics generated by microservices (e.g., CPU utilization, memory usage) and the dynamic topologies generated by calls between application programming interfaces (APIs) operate on different time scales. Moreover, the status of devices is an important aspect of RCA in IoT. Together, these factors make it extremely challenging to learn failure features from microservice metrics and API calls. We therefore propose a novel framework for collaborative identification of root cause analysis (CIRCA), which identifies the most likely root cause path, namely the path with the highest fault scores (weights). Specifically, we use both microservice-level and API-level root cause identification (RCI) models to obtain the fault score of each node in the path. We prove that the root cause path inference problem is NP-hard and therefore propose a topology-based weighted variable neighborhood search (TWVNS) algorithm that infers the optimal root cause path from the two-level scores and the call topologies. Experiments on four public datasets show that CIRCA achieves satisfactory results in both RCI and path inference.
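To make the path inference objective concrete, the following is a minimal illustrative sketch, not the authors' TWVNS algorithm. It assumes that the microservice-level and API-level scores are merged by a weighted sum and that a path's score is the sum of its node scores; neither assumption is stated in the abstract. The topology, the scores, and the names `combine_scores` and `best_path` are all hypothetical. It enumerates every simple path exhaustively, which is feasible only on tiny graphs and is the kind of cost a heuristic search is meant to avoid.

```python
from typing import Dict, List, Tuple


def combine_scores(micro: Dict[str, float], api: Dict[str, float],
                   alpha: float = 0.5) -> Dict[str, float]:
    """Merge two per-node fault scores with a weighted sum (an assumption)."""
    return {n: alpha * micro[n] + (1.0 - alpha) * api[n] for n in micro}


def best_path(graph: Dict[str, List[str]], scores: Dict[str, float],
              start: str) -> Tuple[List[str], float]:
    """Exhaustively search simple paths from `start` for the highest total score."""
    best_nodes, best_score = [start], scores[start]
    stack = [([start], scores[start])]
    while stack:
        path, total = stack.pop()
        if total > best_score:
            best_nodes, best_score = path, total
        for nxt in graph.get(path[-1], []):
            if nxt not in path:  # keep the path simple (no repeated nodes)
                stack.append((path + [nxt], total + scores[nxt]))
    return best_nodes, best_score


# Hypothetical call topology: an edge means "caller invokes callee".
graph = {
    "gateway": ["auth", "orders"],
    "auth": ["db"],
    "orders": ["db", "device-api"],
    "db": [],
    "device-api": [],
}
# Hypothetical fault scores from the two levels.
micro = {"gateway": 0.1, "auth": 0.2, "orders": 0.7, "db": 0.3, "device-api": 0.9}
api = {"gateway": 0.1, "auth": 0.1, "orders": 0.5, "db": 0.2, "device-api": 0.8}

combined = combine_scores(micro, api)
path, score = best_path(graph, combined, "gateway")
print(" -> ".join(path), round(score, 2))
```

Because the number of simple paths grows exponentially with the size of the topology, exhaustive enumeration of this kind does not scale.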
Recommended Citation
X. Jiang et al., "CIRCA: A Framework for Collaborative Identification of Root Cause Analysis in IoT Microservices," IEEE Transactions on Services Computing, Institute of Electrical and Electronics Engineers; Computer Society, Jan 2025.
The definitive version is available at https://doi.org/10.1109/TSC.2025.3631804
Department(s)
Computer Science
Publication Status
Early Access
Keywords and Phrases
IoT Microservice Architecture; Path Inference; Root Cause Analysis; Root Cause Identification
International Standard Serial Number (ISSN)
1939-1374
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2025 Institute of Electrical and Electronics Engineers; Computer Society, All rights reserved.
Publication Date
01 Jan 2025
