Keywords and Phrases

MicroRNAs

Abstract

"The need for automating genome analysis is a result of the tremendous amount of genomic data. As of today, a high-throughput DNA sequencing machine can run millions of sequencing reactions in parallel, and it is becoming faster and cheaper to sequence the entire genome of an organism. Public databases containing genomic data are growing exponentially, and hence the rise in demand for intuitive automated methods of DNA analysis and subsequent gene identification. However, the complexity of gene organization makes automation a challenging task, and smart algorithm design and parallelization are necessary to perform accurate analyses in reasonable amounts of time. This work describes two such automated methods for the identification of novel genes within given DNA sequences. The first method utilizes negative selection patterns as an evolutionary rationale for the identification of additional members of a gene family. As input it requires a known protein coding gene in that family. The second method is a massively parallel data mining algorithm that searches a whole genome for inverted repeats (palindromic sequences) and identifies potential precursors of non-coding RNA genes. Both methods were validated successfully on the fully sequenced and well studied plant species, Arabidopsis thaliana"--Abstract, page iv.

Advisor(s)

Erçal, Fikret
Frank, Ronald L.

Committee Member(s)

Leopold, Jennifer
Chellappan, Sriram
Madria, Sanjay Kumar

Department(s)

Computer Science

Degree Name

Ph. D. in Computer Science

Publisher

Missouri University of Science and Technology

Publication Date

Summer 2010

Journal article titles appearing in thesis/dissertation

Validation of an NSP-based (negative selection pattern) gene family identification strategy
Automation of an NSP-based (negative selection pattern) gene family identification strategy
Framework for automated enrichment of functionally significant inverted repeats in whole genomes

Pagination

ix, 63 pages

Note about bibliography

Includes bibliographical references.

Document Type

Dissertation - Open Access

File Type

text

Language

English

Subject Headings

DNA -- Analysis
Genes -- Identification
RNA -- Analysis
Sequence alignment (Bioinformatics)

Thesis Number

T 9659

Print OCLC #

692208267

Electronic OCLC #

752210699

Recommended Citation

Kandoth, Cyriac, "Computational methods for the discovery and analysis of genes and other functional DNA sequences" (2010). Doctoral Dissertations. 1903.
https://scholarsmine.mst.edu/doctoral_dissertations/1903

Included in

Computer Sciences Commons

Doctoral Dissertations

Computational methods for the discovery and analysis of genes and other functional DNA sequences
