A Parallel Algorithm for Anonymizing Large-Scale Trajectory Data
With the proliferation of location-based services, service providers now collect enormous quantities of location data. If these datasets could be published, they would be valuable assets to various sectors. However, two major concerns considerably limit the availability and usage of these trajectory datasets. The first is the threat to individual privacy. The second is the ability to analyze exabytes of location data in a timely manner. Although trajectory anonymization approaches have been proposed in the past to mitigate privacy concerns, none of these prior works addresses the scalability issue, since it is a newly emerging problem brought about by the significantly increasing adoption of location-based services. In this paper, we propose a novel parallel trajectory anonymization algorithm that achieves scalability, strong privacy protection, and a high utility rate for the anonymized trajectory datasets. We have conducted extensive experiments using MapReduce on real and synthetic datasets, and our results demonstrate both effectiveness and efficiency when compared with centralized approaches.
K. Ward et al., "A Parallel Algorithm for Anonymizing Large-Scale Trajectory Data," ACM Transactions on Data Science (TDS), vol. 1, no. 1, Association for Computing Machinery (ACM), Feb 2020.
The definitive version is available at https://doi.org/10.1145/3368639
Intelligent Systems Center
Second Research Center/Lab
Center for Research in Energy and Environment (CREE)
Third Research Center/Lab
Center for High Performance Computing Research
Article - Journal
© 2020 Association for Computing Machinery (ACM), All rights reserved.
01 Feb 2021
This work was funded by the National Science Foundation (Grant No. NSF-DGE-1914771) and Department of Education (Grant No. P200A120110).