MGARD+: Optimizing Multilevel Methods for Error-Bounded Scientific Data Reduction
Abstract
Data management is becoming increasingly important in dealing with the large amounts of data produced by today's large-scale scientific simulations and instruments. Existing multilevel compression algorithms offer a promising way to manage scientific data at scale, but may suffer from relatively low performance and reduction quality. In this paper, we propose MGARD+, a multilevel data reduction and refactoring framework drawing on previous multilevel methods, to achieve high-performance data decomposition and high-quality error-bounded lossy compression. Our contributions are four-fold: 1) We propose a level-wise coefficient quantization method, which uses different error tolerances to quantize the multilevel coefficients. 2) We propose an adaptive decomposition method, which treats the multilevel decomposition as a preconditioner and terminates the decomposition process at an appropriate level. 3) We leverage a set of algorithmic optimization strategies to significantly improve the performance of multilevel decomposition/recomposition. 4) We evaluate our proposed method using four real-world scientific datasets and compare it with several state-of-the-art lossy compressors. Experiments demonstrate that our optimizations improve the decomposition/recomposition performance of the existing multilevel method by up to 70X, and that the proposed compression method can improve the compression ratio by up to 2X compared with other state-of-the-art error-bounded lossy compressors under the same level of data distortion.
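As a rough illustration of the level-wise quantization idea in contribution 1, the sketch below decomposes a 1D array into multilevel coefficients and quantizes each level with its own tolerance. It is not the MGARD+ algorithm: it assumes a simplified interpolation-only hierarchy, an input length of 2^L + 1, and a hypothetical geometric split of the error budget (`level_tolerances` with `ratio=0.5`). All function names are illustrative. Under these assumptions, the per-level tolerances sum to the total budget, which bounds the pointwise reconstruction error by that budget.

```python
import numpy as np

def decompose(u):
    """Interpolation-only multilevel decomposition of a 1D array of length 2^L + 1.

    Returns the two coarsest nodes and a list of coefficient arrays ordered
    from coarsest to finest level. (Simplified stand-in, not MGARD itself.)
    """
    cur = np.asarray(u, dtype=float)
    n = cur.size - 1
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("input length must be 2^L + 1")
    coeffs = []
    while cur.size > 2:
        coarse = cur[::2]
        # Coefficient = odd-node value minus linear interpolation of its neighbors.
        coeffs.append(cur[1::2] - 0.5 * (coarse[:-1] + coarse[1:]))
        cur = coarse
    return cur, coeffs[::-1]

def recompose(coarse, coeffs):
    """Invert decompose(): rebuild the fine grid level by level."""
    cur = np.asarray(coarse, dtype=float)
    for d in coeffs:
        fine = np.empty(2 * cur.size - 1)
        fine[::2] = cur
        fine[1::2] = 0.5 * (cur[:-1] + cur[1:]) + d
        cur = fine
    return cur

def quantize(x, tol):
    """Uniform scalar quantization with bin width 2*tol (error at most tol)."""
    return np.round(x / (2.0 * tol)).astype(np.int64)

def dequantize(q, tol):
    return q * (2.0 * tol)

def level_tolerances(total, n_parts, ratio=0.5):
    """Hypothetical budget split: geometric weights normalized to sum to `total`."""
    w = ratio ** np.arange(n_parts)
    return total * w / w.sum()

def compress_roundtrip(u, total_tol):
    """Decompose, quantize each level with its own tolerance, then reconstruct."""
    coarse, coeffs = decompose(u)
    tols = level_tolerances(total_tol, len(coeffs) + 1)
    coarse_r = dequantize(quantize(coarse, tols[0]), tols[0])
    coeffs_r = [dequantize(quantize(d, t), t) for d, t in zip(coeffs, tols[1:])]
    return recompose(coarse_r, coeffs_r)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 129)
    u = np.sin(2.0 * np.pi * x)
    rec = compress_roundtrip(u, 1e-3)
    print("max abs error:", np.max(np.abs(rec - u)))
```

The bound holds here because linear interpolation averages the errors of neighboring coarse nodes without amplifying them, so each level adds at most its own tolerance. A decomposition that includes a projection step, as in multilevel methods of the MGARD family, would need a different error analysis.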
Recommended Citation
X. Liang et al., "MGARD+: Optimizing Multilevel Methods for Error-Bounded Scientific Data Reduction," IEEE Transactions on Computers, Institute of Electrical and Electronics Engineers (IEEE), Jan 2021.
The definitive version is available at https://doi.org/10.1109/TC.2021.3092201
Department(s)
Computer Science
Research Center/Lab(s)
Intelligent Systems Center
Publication Status
Early Access
Keywords and Phrases
Arrays; Compressors; Computers; Data Models; Distortion; Error Control; High-Performance Computing; Lossy Compression; Multilevel Decomposition; Optimization; Quantization (Signal); Scientific Data
International Standard Serial Number (ISSN)
0018-9340; 1557-9956
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2021 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Jan 2021
Comments
This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC.