M-Grid: A Distributed Framework for Multidimensional Indexing and Querying of Location based Data
Abstract
The widespread use of mobile devices and the real-time availability of user-location information are facilitating the development of new personalized, location-based applications and services (LBSs). Such applications require multi-attribute query processing, scalability to support millions of users, real-time querying, and the ability to analyze large volumes of data. Cloud computing has given rise to a new generation of distributed databases commonly known as key-value stores. Key-value stores were designed to extract values from very large volumes of data while being highly available, fault-tolerant and scalable, and hence provide much-needed infrastructure to support LBSs. However, they cannot efficiently process complex queries over multidimensional data, because they provide no means of accessing multiple attributes. In this paper, we present M-Grid, a unifying indexing and data distribution framework that enables key-value stores to support multidimensional queries. We organize a set of nodes in a modified P-Grid overlay network, which provides efficient data distribution, fault tolerance and query processing over multidimensional data. For indexing, we use a linearization technique based on the Hilbert space-filling curve, which preserves data locality, to efficiently manage multidimensional data in a key-value store. We propose algorithms to dynamically process range and k-nearest-neighbor (kNN) queries on the linearized values, which removes the overhead of maintaining a separate index table. Our approach is completely independent of the underlying storage layer and can be implemented on any cloud infrastructure. Our experiments on Amazon EC2 show that M-Grid achieves a performance improvement of three orders of magnitude over MapReduce and a fourfold improvement over the MD-HBase scheme.
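The linearization idea in the abstract can be illustrated with a minimal sketch. It is not the M-Grid implementation: it assumes a two-dimensional grid whose side `n` is a power of two, uses the standard iterative Hilbert-curve encoding, and the names `hilbert_key` and `range_to_key_intervals` are hypothetical. The range helper simply enumerates every cell in a rectangle and merges consecutive keys, which stands in for (and is far less efficient than) the range-query algorithm the paper proposes.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate/flip the coordinates so each sub-square is traversed in the right orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def hilbert_key(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its position on the Hilbert curve."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    if not (0 <= x < n and 0 <= y < n):
        raise ValueError("cell is outside the grid")
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)   # offset contributed by the quadrant at this level
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


def range_to_key_intervals(n, x_lo, x_hi, y_lo, y_hi):
    """Naive illustration only: list the key intervals covering an inclusive rectangle."""
    keys = sorted(
        hilbert_key(n, x, y)
        for x in range(x_lo, x_hi + 1)
        for y in range(y_lo, y_hi + 1)
    )
    intervals = []
    for k in keys:
        if intervals and k == intervals[-1][1] + 1:
            intervals[-1][1] = k        # extend the current run of consecutive keys
        else:
            intervals.append([k, k])    # start a new run
    return [tuple(i) for i in intervals]
```

Because neighboring cells tend to receive nearby keys, a rectangle in the original space usually maps to a small number of contiguous key intervals, each of which a key-value store could serve with an ordinary range scan. A rectangle that straddles quadrant boundaries can still split into several intervals, so the number of scans depends on where the query falls.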
Recommended Citation
S. Kumar et al., "M-Grid: A Distributed Framework for Multidimensional Indexing and Querying of Location based Data," Distributed and Parallel Databases, vol. 35, no. 1, pp. 55-81, Springer, Jan 2017.
The definitive version is available at https://doi.org/10.1007/s10619-017-7194-0
Department(s)
Computer Science
Research Center/Lab(s)
Intelligent Systems Center
Second Research Center/Lab
Center for High Performance Computing Research
International Standard Serial Number (ISSN)
0926-8782; 1573-7578
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2017 Springer, All rights reserved.
Publication Date
01 Jan 2017