Publishing Big Trajectory Data with Privacy Preservation
Department
Computer Science
Major
Computer Science
Research Advisor
Lin, Dan
Advisor's Department
Computer Science
Funding Source
National Science Foundation (NSF)
Abstract
One of the biggest trends in mobile technology is the collection of trajectory data for analysis and location prediction. Although collecting such data through mobile phones and vehicle GPS systems is not new, current research seeks better ways to preserve the privacy of the users whose data is being collected. Over the past few years, several methods have been introduced, including k-anonymity, data suppression, and data masking; however, none of these methods addresses the huge amount of data generated by an entire city of users. The amount of data transmitted every month is on the order of exabytes. In this paper, we propose a new method that uses MapReduce to anonymize very large data sets so that individual users cannot be identified in the published data, while retaining as much of the data as possible. Because MapReduce is easy to manage on multiple commodity machines and can be configured to choose the number of machines for a given task dynamically, we believe this method is more scalable and will continue to outperform traditional methods as the amount of data grows.
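The abstract does not describe the algorithm itself, so the following is only a minimal sketch of how k-anonymity with suppression might be phrased as map and reduce steps. It is not the authors' method. The record layout (a user ID with a list of x, y, t points), the grid-cell and time-slot generalization, and the names map_phase, shuffle, reduce_phase, and anonymize are all assumptions made for illustration, and the in-memory shuffle stands in for what a MapReduce framework would do across machines.

```python
from collections import defaultdict


def map_phase(record, cell=1.0, slot=3600):
    """Map step: emit (generalized trajectory, user_id).

    Each (x, y, t) point is coarsened to a grid cell and a time slot.
    The cell size and slot length are illustrative assumptions.
    """
    user_id, points = record
    key = tuple(
        (int(x // cell), int(y // cell), int(t // slot)) for x, y, t in points
    )
    return key, user_id


def shuffle(pairs):
    """Group user IDs by generalized trajectory (done by the framework in practice)."""
    groups = defaultdict(set)
    for key, user_id in pairs:
        groups[key].add(user_id)
    return groups


def reduce_phase(key, users, k):
    """Reduce step: publish a trajectory only if at least k users share it.

    User IDs are dropped; groups smaller than k are suppressed (None).
    """
    if len(users) >= k:
        return key, len(users)
    return None


def anonymize(records, k=2, cell=1.0, slot=3600):
    """Run map, shuffle, and reduce in memory and return the published groups."""
    groups = shuffle(map_phase(r, cell, slot) for r in records)
    published = []
    for key, users in groups.items():
        result = reduce_phase(key, users, k)
        if result is not None:
            published.append(result)
    return sorted(published)


if __name__ == "__main__":
    # Hypothetical sample data: u1 and u2 share a coarsened path, u3 is unique.
    sample = [
        ("u1", [(0.2, 0.3, 100), (1.5, 0.4, 4000)]),
        ("u2", [(0.7, 0.9, 200), (1.1, 0.8, 5000)]),
        ("u3", [(5.0, 5.0, 100)]),
    ]
    for trajectory, count in anonymize(sample, k=2):
        print(trajectory, count)
```

In this sketch, a larger k or a finer grid suppresses more trajectories, which illustrates the trade-off the abstract mentions between protecting individual users and keeping as much data as possible.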
Biography
Katrina is a senior in computer science and has been accepted into the Computer Science PhD program with a full scholarship. She has done prior research with Dr. Jennifer Leopold in 3D Spatial and Temporal Reasoning. Currently, she works under her PhD advisor in Big Data and Privacy Preservation and has been awarded an NSF Undergraduate Fellowship for her work.
Research Category
Sciences
Presentation Type
Oral Presentation
Document Type
Presentation
Award
Sciences oral presentation, Second place
Location
Carver Room
Presentation Date
16 Apr 2014, 10:00 am - 10:30 am