Masters Theses

Author

Ajay Mane

Abstract

"Exhaustive searches for all members of a known gene family as well as the identification of new gene families have become increasingly important in the field of bioinformatics. The identification process consists of many steps that use software applications. However, there are no tools that provide comprehensive information on identifying gene families from ESTs alone. Manual intervention in the identification process is very time consuming because the genome data are huge.

This study has simplified the gene family identification process by automating the various steps and developing the applications SimESTs, PCAT, SCAT, and has significantly reduced the time taken for identifying gene families from ESTs. One can quickly identify the gene families in plants that exhibit a purifying selection between members. In addition, the applications have been used to test the quality of the clusters produced by existing clustering algorithms such as Unigene.

This study used the automated methods to 1) identify correctly three of the four PAL gene family members of Arabidopsis thaliana using ESTs from dbEST of NCBI, 2) apply the method on the CAD gene family of Glycine max and correctly identify two to six CAD gene family members that were previously unidentified, 3) identify two members of a putative Glycine max gene family previously unidentified in any plant species, and 4) test the quality of Unigene Glycine max clusters"--Abstract, page iii.

Advisor(s)

Erçal, Fikret
Frank, Ronald L.

Committee Member(s)

Leopold, Jennifer

Department(s)

Computer Science

Degree Name

M.S. in Computer Science

Publisher

University of Missouri--Rolla

Publication Date

Summer 2006

Pagination

ix, 45 pages

Note about bibliography

Includes bibliographical references (pages 43-44)

Rights

© 2006 Ajay Mane, All rights reserved.

Document Type

Thesis - Restricted Access

File Type

text

Language

English

Subject Headings

Arabidopsis -- Genetics
Genes -- Identification
Genetic code
Genetics -- Data processing

Thesis Number

T 8921

Print OCLC #

79477523
