Abstract

Parallelism is key to efficiently utilizing high-speed research networks when transferring large volumes of data. However, the monolithic design of existing transfer applications requires the same level of parallelism to be used for the read, write, and network operations of a file transfer. This, in turn, overburdens system resources, since setting the parallelism level to suit the slowest component results in unnecessarily high parallelism for the other components. Using more parallelism than necessary increases overhead on system resources and leads to unfair resource allocation among competing transfers. In this paper, we introduce a modular file transfer architecture, Marlin, that separates I/O and network operations for file transfers so that parallelism can be adjusted independently for each component. Marlin adopts an online gradient descent algorithm to swiftly search the solution space and find the optimal level of parallelism for read, transfer, and write operations. Experimental results collected under various network settings show that Marlin can identify and use a minimum parallelism level for each component, improving both fairness among competing transfers and CPU utilization. Finally, separating network transfers from write operations allows Marlin to outperform state-of-the-art solutions by more than 2x when transferring small datasets.
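The idea of tuning each component independently can be illustrated with a small sketch. The abstract does not give Marlin's utility function or update rule, so everything below is an assumption made for illustration: the saturating throughput model, the per-thread penalty, the capacities, and the names `make_component`, `utility`, and `tune_parallelism` are all hypothetical. The sketch applies a finite-difference gradient step to one integer parallelism level per component and keeps the best level it has observed.

```python
def make_component(per_thread, capacity):
    """Hypothetical component: throughput grows linearly with the number of
    threads until it saturates at `capacity` (arbitrary units)."""
    def throughput(level):
        return min(level * per_thread, capacity)
    return throughput


def utility(level, throughput_fn, penalty=0.05):
    """Assumed objective: reward throughput, charge a small cost per thread.
    In a real transfer, each evaluation would be a live measurement."""
    return throughput_fn(level) - penalty * level


def tune_parallelism(throughput_fn, start=1, max_level=32, step=2.0, rounds=40):
    """Gradient-style search over an integer parallelism level.

    The gradient is estimated from the two neighbouring levels; the level
    with the best observed utility is returned, because an integer search
    can keep oscillating around the optimum.
    """
    level = start
    best_level, best_utility = level, utility(level, throughput_fn)
    for _ in range(rounds):
        lo, hi = max(1, level - 1), min(max_level, level + 1)
        if hi == lo:  # only one level is allowed, nothing to search
            break
        grad = (utility(hi, throughput_fn) - utility(lo, throughput_fn)) / (hi - lo)
        move = round(step * grad)
        if move == 0 and grad != 0:
            move = 1 if grad > 0 else -1  # always take at least one step
        level = min(max_level, max(1, level + move))
        current = utility(level, throughput_fn)
        if current > best_utility:
            best_level, best_utility = level, current
    return best_level


# Made-up components: read threads are fast, write threads are slow.
components = {
    "read": make_component(per_thread=2.0, capacity=8.0),
    "network": make_component(per_thread=1.0, capacity=8.0),
    "write": make_component(per_thread=0.5, capacity=8.0),
}

# Each component is tuned on its own, as in a modular design.
levels = {name: tune_parallelism(fn) for name, fn in components.items()}

modular_threads = sum(levels.values())
# A monolithic design applies the slowest component's level everywhere.
monolithic_threads = len(levels) * max(levels.values())

print(levels, modular_threads, monolithic_threads)
```

With these made-up capacities the search settles on 4 read, 8 network, and 16 write threads, 28 in total, whereas applying the write level to all three components would use 48 threads for the same throughput. The numbers are properties of the toy model only and say nothing about the measurements reported in the paper.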

Department(s)

Computer Science

Publication Status

Public Access

Keywords and Phrases

wide-area file transfers, i/o parallelism, online optimization, high performance networks

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Association for Computing Machinery, All rights reserved.

Publication Date

21 Jun 2023
