Computer Science Faculty Research & Creative Works

MapReduce Based Parallel Suffix Tree Construction For Human Genome

Umesh Chandra Satish
Praveenkumar Kondikoppa
Seung Jong Park, Missouri University of Science and Technology
Manish Patil
Rahul Shah

Abstract

Genome indexing is the basis for many bioinformatics applications. Read mapping (sequence alignment) is one such application, where the goal is to align millions of short reads against a reference genome. Several read-mapping tools are available, and they rely on different indexing techniques to expedite the alignment process. However, many of these contemporary alignment programs are sequential and memory intensive, and they cannot be easily scaled to larger genomes. The suffix tree is one of the most widely used data structures for indexing strings (genomes). Building a scalable suffix-tree-based tool is particularly challenging because of the difficulties involved in constructing the suffix tree in parallel. Several suffix tree construction techniques have been proposed to date, with a focus on the space-time tradeoff. Most of these existing works address construction on a uniprocessor and cannot be easily extended to utilize modern multi-processor systems. In this paper we investigate and propose a MapReduce-based parallel construction of the suffix tree. We demonstrate the performance of the algorithm on a commodity cluster using up to 32 nodes, each having 8 GB of primary memory.
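
The abstract does not describe the construction algorithm itself, so the following is only an illustrative sketch of how suffix indexing can be split into map and reduce steps. It is not the authors' method. It assumes a simple prefix-partitioning scheme: the map step keys every suffix by its first few characters, and each reduce step builds an independent, uncompressed suffix trie for one key. All names (`map_suffixes`, `shuffle`, `reduce_build_subtree`, `build_partitioned_index`, `find`) are hypothetical, and the whole thing runs in a single process rather than on a cluster.

```python
from collections import defaultdict

LEAF = "leaf_offset"  # multi-character key, so it cannot collide with a single text character


def map_suffixes(text, prefix_len):
    """Map step (assumed scheme): emit (prefix, start offset) for every suffix."""
    for start in range(len(text)):
        yield text[start:start + prefix_len], start


def shuffle(pairs):
    """Group offsets by key, standing in for the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, offset in pairs:
        groups[key].append(offset)
    return groups


def reduce_build_subtree(text, offsets):
    """Reduce step: build an uncompressed suffix trie (nested dicts) for one group.

    A real suffix tree compresses unary paths; this naive trie needs
    quadratic space and is only suitable for tiny inputs.
    """
    root = {}
    for start in offsets:
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
        node[LEAF] = start
    return root


def build_partitioned_index(text, prefix_len=1):
    """Run map, shuffle, and reduce in sequence; each subtree is independent."""
    text = text + "$"  # unique terminator so no suffix is a prefix of another
    groups = shuffle(map_suffixes(text, prefix_len))
    subtrees = {key: reduce_build_subtree(text, offs) for key, offs in groups.items()}
    return text, subtrees


def find(index, pattern):
    """Return the sorted start offsets at which pattern occurs in the text."""
    _, subtrees = index
    hits = []
    for key, root in subtrees.items():
        m = min(len(key), len(pattern))
        if key[:m] != pattern[:m]:
            continue  # this partition cannot contain a matching suffix
        node = root
        for ch in pattern:
            node = node.get(ch)
            if node is None:
                break
        else:
            # Every leaf below the matched node is one occurrence.
            stack = [node]
            while stack:
                cur = stack.pop()
                for k, v in cur.items():
                    if k == LEAF:
                        hits.append(v)
                    else:
                        stack.append(v)
    return sorted(hits)
```

Because every suffix sharing a prefix lands in the same group, the subtrees never need to be merged, which is what makes this kind of partitioning a natural fit for independent reducers. Choosing the prefix length trades the number of partitions against their size.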

Recommended Citation

U. C. Satish et al., "MapReduce Based Parallel Suffix Tree Construction For Human Genome," Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS, pp. 664 - 670, article no. 7097867, Institute of Electrical and Electronics Engineers, Jan 2014.

The definitive version is available at https://doi.org/10.1109/PADSW.2014.7097867

Department(s)

Computer Science

Keywords and Phrases

genome; indexing; map-reduce; parallel; suffix tree

International Standard Book Number (ISBN)

978-147997615-7

International Standard Serial Number (ISSN)

1521-9097

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

01 Jan 2014

Included in

Computer Sciences Commons