An Efficient Data Processing Framework for Mining the Massive Trajectory of Moving Objects
Abstract
Recent advances in positioning technology make it possible to collect large-scale trajectory data for moving objects. Efficient processing and analysis of massive trajectory data has thus become an emerging and challenging task for both researchers and practitioners. In this paper, we propose an efficient data processing framework for mining massive trajectory data. The framework includes three modules: (1) a data distribution module, (2) a data transformation module, and (3) a high-performance I/O module. For the data distribution module, we design a two-step consistent hashing algorithm that takes into account load balancing, data locality, and scalability. For the data transformation module, we present a parallel strategy for a linear referencing algorithm with reduced subtask coupling, straightforward parallelization, and low communication cost. For the I/O module, we propose a compression-aware design to improve processing efficiency. Finally, we conduct a comprehensive performance evaluation on a synthetic dataset (1.114 TB) and a real-world taxi GPS dataset (578 GB). The experimental results demonstrate the advantages of the proposed framework.
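As background for the data distribution module, the following is a minimal sketch of plain consistent hashing with virtual nodes. It is not the two-step algorithm proposed in the paper, whose details are not given in this record. The class name HashRing, the node names, the replica count, and the "taxi-<id>" key format are all hypothetical and chosen only for illustration.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a 64-bit integer position on the hash ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Generic consistent hash ring (illustrative, not the paper's method)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self._positions = []       # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place several virtual nodes on the ring to even out the load.
        for i in range(self.replicas):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue  # skip the (very unlikely) position collision
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node):
        # Drop every ring position owned by the node.
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # The owner is the first ring position clockwise from the key's hash.
        if not self._positions:
            raise ValueError("hash ring has no nodes")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("taxi-42")  # hypothetical moving-object ID used as the key
```

In this sketch, keying records by a moving-object ID sends all points of one trajectory to the same node, and adding a node moves only the keys that the new node takes over. How the paper's second hashing step refines this placement is not described here.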
Recommended Citation
Y. Zhou et al., "An Efficient Data Processing Framework for Mining the Massive Trajectory of Moving Objects," Computers, Environment and Urban Systems, vol. 61, pp. 129-140, Elsevier, Jan 2017.
The definitive version is available at https://doi.org/10.1016/j.compenvurbsys.2015.03.004
Department(s)
Computer Science
Keywords and Phrases
Data compression; Data handling; Global positioning system; Linear transformations; Mathematical transformations; Metadata; Network management; Taxicabs; Trajectories; Comprehensive performance evaluation; Consistent hashing; Consistent Hashing algorithms; Contribution model; Moving objects; Parallel linear referencing; Parallel strategies; Positioning technologies; Big data; Algorithm; Data mining; Data set; GPS; Trajectory; Compression contribution model; Trajectory of moving object; Two step consistent hashing
International Standard Serial Number (ISSN)
0198-9715
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2017 Elsevier, All rights reserved.
Publication Date
01 Jan 2017
Comments
This work is supported by the Chinese "Twelfth Five-Year" Plan for Science & Technology Support under Grant Nos. 2012BAK17B01 and 2013BAD15B02, the Natural Science Foundation of China (NSFC) under Grant Nos. 91224006, 61003138 and 41371386, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant Nos. XDA06010202 and XDA05050601, and the joint project by Foshan and the Chinese Academy of Sciences under Grant No. 2012YS23.