Error-Controlled, Progressive, and Adaptable Retrieval of Scientific Data with Multilevel Decomposition
Extreme-scale simulations and high-resolution instruments have been generating an increasing amount of data, which poses significant challenges not only to data storage during the run, but also to post-processing, where data will be repeatedly retrieved and analyzed over a long period of time. The challenge of satisfying a wide range of post-hoc analysis needs while minimizing the I/O overhead caused by inappropriate and/or excessive data retrieval should never be left unmanaged. In this paper, we propose a data refactoring, compressing, and retrieval framework capable of 1) fine-grained data refactoring with regard to precision; 2) incrementally retrieving and recomposing the data in terms of various error bounds; and 3) adaptively retrieving data in multi-precision and multi-resolution with respect to different analyses. With the progressive data recomposition and the adaptable retrieval algorithms, our framework significantly reduces the amount of data retrieved when multiple incremental precisions are requested and/or the downstream analysis time when coarse resolution is used. Experiments show that the amount of data retrieved under the same progressively requested error bound using our framework is 64% less than that using state-of-the-art single-error-bounded approaches. Parallel experiments with up to 1,024 cores and 600 GB of data in total show that our approach yields 1.36x and 2.52x performance over existing approaches in writing to and reading from persistent storage systems, respectively.
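To make the idea of incremental, error-bounded retrieval concrete, the following is a minimal sketch and not the framework described in the paper. It assumes a much simpler scheme: values are stored as fixed-point bitplanes, most significant first, and a reader fetches only as many planes as a requested absolute error bound requires. It omits the multilevel decomposition and multi-resolution retrieval entirely. The function names (`refactor`, `planes_needed`, `recompose`) and the 32-plane default are hypothetical choices for illustration.

```python
import math

def refactor(values, num_planes=32):
    """Split values into sign flags plus bitplanes, most significant plane first."""
    max_abs = max(abs(v) for v in values)
    # frexp gives max_abs = m * 2**exp with 0.5 <= m < 1, so every |v| < 2**exp.
    _, exp = math.frexp(max_abs)
    scale = 2.0 ** (num_planes - exp)
    # Truncated fixed-point magnitudes; each fits in num_planes bits.
    fixed = [int(abs(v) * scale) for v in values]
    signs = [v < 0 for v in values]
    planes = [[(f >> (num_planes - 1 - p)) & 1 for f in fixed]
              for p in range(num_planes)]
    return {"exp": exp, "signs": signs, "planes": planes}

def planes_needed(refactored, error_bound):
    """Smallest number of leading planes whose worst-case error meets the bound."""
    exp = refactored["exp"]
    n = len(refactored["planes"])
    for k in range(n + 1):
        # With k planes, the absolute error is strictly below 2**(exp - k).
        if 2.0 ** (exp - k) <= error_bound:
            return k
    return n  # the bound is tighter than the stored precision allows

def recompose(refactored, k):
    """Rebuild approximate values from the first k bitplanes only."""
    n = len(refactored["planes"])
    scale = 2.0 ** (n - refactored["exp"])
    out = []
    for i, negative in enumerate(refactored["signs"]):
        magnitude = 0
        for p in range(k):
            magnitude |= refactored["planes"][p][i] << (n - 1 - p)
        value = magnitude / scale
        out.append(-value if negative else value)
    return out

# Progressive use: tighter bounds only require the planes not yet fetched.
data = [3.14159, -2.71828, 0.001, 100.5]
stored = refactor(data)
fetched = 0
for bound in (1.0, 0.01, 1e-5):
    k = planes_needed(stored, bound)
    new_planes = range(fetched, k)   # only these would be read from storage
    fetched = max(fetched, k)
    approx = recompose(stored, fetched)
    worst = max(abs(a - b) for a, b in zip(approx, data))
    print(f"bound={bound}: read {len(new_planes)} new planes, max error={worst:.3g}")
```

In this sketch, a reader that has already fetched planes for a loose bound reuses them when a tighter bound is requested, which is the property the abstract contrasts with single-error-bounded approaches that would re-read a separately compressed copy.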
X. Liang et al., "Error-Controlled, Progressive, and Adaptable Retrieval of Scientific Data with Multilevel Decomposition," Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (2021, St. Louis, MO), Association for Computing Machinery (ACM), Nov 2021.
The definitive version is available at https://doi.org/10.1145/3458817.3476179
International Conference for High Performance Computing, Networking, Storage and Analysis, SC'21 (2021: Nov. 14-19, St. Louis, MO)
Keywords and Phrases
Data compression; data retrieval; error control; storage and I/O
Article - Conference proceedings
© 2021 Association for Computing Machinery (ACM), All rights reserved.
19 Nov 2021