Abstract

Logs record operations and events in text form while a system runs. They are an essential basis for detecting and identifying potential security threats and system failures, and are widely used in system management to ensure security and reliability. Existing log sequence anomaly detection methods are limited by log parsing and do not consider all key features of logs, which can cause false alarms or missed detections. In this paper, we propose a fast and accurate log parsing method and feed the entire log content into a deep learning network for analysis. To avoid semantic loss during parsing, we replace some variables with tokens that carry semantic information and split logs at an appropriate granularity. To keep parsing both fast and accurate, we propose a similarity-based fast merging method to handle redundant templates. For anomaly detection, we use features of the complete log content as input to the model: Bidirectional Encoder Representations from Transformers (BERT) considers both the global and local information of log sequences and outputs anomaly detection results directly. Experiments show that our log parsing method achieves the best average parsing quality on 16 datasets, and that the anomaly detection method achieves the best results on different datasets.
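The two parsing ideas named in the abstract, replacing variables with semantic tokens and merging redundant templates by similarity, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the token names (`<IP>`, `<HEX>`, `<NUM>`), the regular expressions, the position-wise match ratio used as the similarity measure, the `<*>` wildcard, and the 0.6 threshold.

```python
import re

# Hypothetical semantic tokens; the abstract does not list the paper's token set.
PATTERNS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def mask_variables(line):
    """Replace variable fields with semantic tokens, then split on whitespace."""
    for pattern, token in PATTERNS:
        line = pattern.sub(token, line)
    return line.split()


def similarity(a, b):
    """Assumed measure: fraction of positions holding identical tokens.

    Token lists of different length are treated as dissimilar (0.0).
    """
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def merge(a, b):
    """Keep tokens that agree; mark disagreeing positions with a wildcard."""
    return [x if x == y else "<*>" for x, y in zip(a, b)]


def parse(lines, threshold=0.6):
    """Group log lines into templates, merging any that are similar enough."""
    templates = []
    for line in lines:
        tokens = mask_variables(line)
        for i, template in enumerate(templates):
            if similarity(template, tokens) >= threshold:
                templates[i] = merge(template, tokens)
                break
        else:
            # No existing template was similar enough, so start a new one.
            templates.append(tokens)
    return [" ".join(t) for t in templates]


if __name__ == "__main__":
    logs = [
        "Connection from 10.0.0.1:5000 closed",
        "Connection from 10.0.0.2:5001 closed",
        "Deleting block 42 by user alice",
        "Deleting block 43 by user bob",
    ]
    for template in parse(logs):
        print(template)
```

In this sketch the masked tokens stay in the template (an address remains recognizable as `<IP>` rather than becoming a generic wildcard), which is one way to read the abstract's point about avoiding semantic loss during parsing.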

Department(s)

Computer Science

Comments

National Natural Science Foundation of China, Grant 61877005

Keywords and Phrases

Anomaly detection; BERT; Bidirectional control; Deep learning; Encoding; Feature extraction; Log parsing; Log sequence; Long short term memory; Semantics; System reliability

International Standard Serial Number (ISSN)

1941-0018; 1545-5971

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Jan 2024
