LAST-HDFS: Location-Aware Storage Technique for Hadoop Distributed File System

Abstract

Enabled by state-of-the-art cloud computing technologies, cloud storage has gained increasing popularity in recent years. Despite the flexible and reliable data access such services offer, users must accept that they do not actually know the whereabouts of their data. This lack of knowledge of and control over the physical location of data can raise legal and regulatory issues, especially for sensitive data that are required by law to remain within certain geographic boundaries. In this paper, we study the problem of data placement control within distributed file systems that support cloud storage. In particular, we consider the open-source Hadoop Distributed File System (HDFS) as the underlying architecture and propose a location-aware cloud storage system, named LAST-HDFS, to support and enforce location-aware storage in HDFS-based clusters. LAST-HDFS also includes a monitoring system deployed at individual hosts to oversee and detect potential data placement violations caused by malicious datanodes. We carried out an extensive experimental evaluation in a real cloud environment that demonstrates the effectiveness and efficiency of the proposed system.

Meeting Name

9th International Conference on Cloud Computing, CLOUD (2016: Jun. 27-Jul. 2, San Francisco, CA)

Department(s)

Computer Science

Research Center/Lab(s)

Intelligent Systems Center

Keywords and Phrases

Location-Awareness; MapReduce; HDFS

International Standard Book Number (ISBN)

978-1-5090-2619-7

International Standard Serial Number (ISSN)

2159-6190

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2016 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.

Publication Date

02 Jul 2016
