Computer Science Faculty Research & Creative Works

Exhaustive RT-RICO Algorithm for Mining Association Rules in Protein Secondary Structure

Leong Lee
Jennifer Leopold, Missouri University of Science and Technology
Ronald L. Frank, Missouri University of Science and Technology

Abstract

Prediction of a protein's secondary structure from its amino acid sequence is a well-studied computational problem in bioinformatics with significant practical research value. Although the secondary structure prediction problem was first defined almost fifty years ago, the accuracy of most modern methods still hovers around 80%. In [1], this research team presented a promising protein secondary structure prediction method, BLAST-RT-RICO (Relaxed Threshold Rule Induction from Coverings), which employs a modified association rule learning approach that uses multiple sequence alignment information. BLAST-RT-RICO achieved Q3 scores of 89.93% and 87.71% on the standard test datasets RS126 and CB396, respectively. However, some areas of the algorithm needed improvement; most importantly, the time complexity of the rule generation step needed to be reduced. Recently, we developed a modified rule generation algorithm, ERT-RICO (Exhaustive Relaxed Threshold Rule Induction from Coverings), that addresses this issue. The research team is now able to run much larger test datasets with different choices of segment length and threshold value; preliminary test results achieved a Q3 score of 92.19% on the standard test dataset RS126. The modified algorithm, its mathematical definitions, and the improved time and space complexity are discussed in this paper.
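
For readers unfamiliar with the Q3 measure quoted above, the following is a minimal sketch of how it is conventionally computed: the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name `q3_score` and the example sequences are illustrative assumptions and are not taken from the paper; this is not the authors' evaluation code.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose 3-state label matches.

    Both arguments are strings over the alphabet H (helix), E (strand),
    C (coil), one character per residue. Hypothetical helper for illustration.
    """
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up 10-residue example: one helix residue is mispredicted as coil.
observed = "CCHHHHEEEC"
predicted = "CCHHHCEEEC"
print(f"Q3 = {q3_score(predicted, observed):.2f}%")
```

In this made-up example, 9 of 10 residues match, so the score is 90%. A dataset-level Q3 is typically obtained by pooling residues across all test proteins.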

Recommended Citation

L. Lee et al., "Exhaustive RT-RICO Algorithm for Mining Association Rules in Protein Secondary Structure," Proceedings of the 2012 IEEE Symposium on Computational Intelligence and Computational Biology (2012, San Diego, CA), pp. 260-266, Institute of Electrical and Electronics Engineers (IEEE), May 2012.

The definitive version is available at https://doi.org/10.1109/CIBCB.2012.6217239

Meeting Name

2012 IEEE Symposium on Computational Intelligence and Computational Biology, CIBCB 2012 (2012: May 9-12, San Diego, CA)

Department(s)

Computer Science

Second Department

Biological Sciences

Keywords and Phrases

Association Rule Mining; Data Mining; Protein Secondary Structure Prediction

International Standard Book Number (ISBN)

978-146731189-2

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

01 May 2012
