Significantly Improving Lossy Compression Quality based on an Optimized Hybrid Prediction Model
Abstract
With the ever-increasing volumes of data produced by today's large-scale scientific simulations, error-bounded lossy compression techniques have become critical: they not only reduce data size significantly but also retain high data fidelity for post-analysis. In this paper, we design a strategy that significantly improves compression quality based on an optimized hybrid prediction model. Our contribution is fourfold. (1) We propose a novel transform-based predictor and optimize its compression quality. (2) We significantly improve the coefficient-encoding efficiency for the data-fitting predictor. (3) We propose an adaptive framework that accurately selects the best-fit predictor for different datasets. (4) We evaluate our solution and several existing state-of-the-art lossy compressors by running real-world applications on a supercomputer with 8,192 cores. Experiments show that our adaptive compressor improves the compression ratio by 112% to 165% compared with the second-best compressor. Parallel I/O performance improves by about 100% because of the significantly reduced data size, and the total I/O time is reduced by up to 60× with our compressor compared with the original I/O time.
Recommended Citation
X. Liang et al., "Significantly Improving Lossy Compression Quality based on an Optimized Hybrid Prediction Model," International Conference for High Performance Computing, Networking, Storage and Analysis (2019, Denver, CO), Association for Computing Machinery (ACM), Nov 2019.
The definitive version is available at https://doi.org/10.1145/3295500.3356193
Meeting Name
International Conference for High Performance Computing, Networking, Storage and Analysis, SC '19 (2019: Nov. 17-19, Denver, CO)
Department(s)
Computer Science
Keywords and Phrases
Compression Performance; Data Dumping/Loading; Error-Bounded Lossy Compression; Rate Distortion
International Standard Book Number (ISBN)
978-1-4503-6229-0
International Standard Serial Number (ISSN)
2167-4329; 2167-4337
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2019 Association for Computing Machinery (ACM), All rights reserved.
Publication Date
17 Nov 2019
Comments
This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC