Statistical Methods for Detection of Differential Methylation in Human Disease Studies

Presenter Information

Samuel Turpin

Department

Mathematics and Statistics

Major

Applied Mathematics, Emphasis in Statistics

Research Advisor

Olbricht, Gayla R.

Advisor's Department

Mathematics and Statistics

Funding Source

Missouri University of Science and Technology OURE Program

Abstract

DNA methylation is an epigenetic modification that occurs when a methyl group is added to cytosine sites on the DNA sequence. Altered DNA methylation patterns have been shown to be characteristic of various human diseases, including many types of cancer. With the advent of next-generation sequencing, DNA methylation can be measured in ways not possible just ten years ago. High-throughput sequencing technology such as Illumina’s HiSeq 2000 enables the quantification of the percent methylation at millions of cytosine locations. Statistical analysis of data from such studies allows researchers to determine which sites exhibit significant differences in their average methylation levels between normal and diseased groups. Using R, an open-source statistical software environment, each site can be tested to determine whether a relationship exists between methylation level and disease status. Several statistical methods exist to test for differences between independent samples, including the two-sample t test, Fisher’s exact test, and the Wilcoxon rank-sum test. The two-sample t test, for example, yields a p-value: the probability of observing a difference in sample means at least as large as the one seen if the true means of the two groups were the same. Sites at which this p-value is very small provide evidence of a significant difference in average methylation level between disease status groups. In this project, we examine and apply these three statistical methods to methylation data from a study comparing senescent cells to normal cells, with the goal of investigating how the three analyses differ and ultimately obtaining a list of significant sites at which to test methylation as an indicator of disease. Furthermore, these statistical methods are highly replicable and can be applied to the many data sets currently available in public archives to test for differentially methylated sites associated with other diseases and conditions. This is a first step toward using methylation as a predictor for a wide range of characteristics.
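The per-site testing described above can be sketched in R along the following lines. This is only an illustration under stated assumptions, not the project's actual analysis: the methylation percentages, the read counts, and the names `meth`, `group`, `test_site`, and `counts` are all hypothetical, and only the base R functions `t.test`, `wilcox.test`, and `fisher.test` are taken as given.

```r
# Hypothetical toy data (invented for illustration): percent methylation
# at 3 cytosine sites for 4 normal and 4 senescent samples.
meth <- rbind(
  site1 = c(82, 85, 80, 88, 41, 38, 45, 36),
  site2 = c(50, 55, 47, 52, 51, 49, 54, 48),
  site3 = c(10, 12,  9, 14, 30, 33, 28, 35)
)
group <- factor(rep(c("normal", "senescent"), each = 4))

# Test one site: return the p-values from the two-sample t test
# (R's default is the Welch version, which does not assume equal
# variances) and from the Wilcoxon rank-sum test.
test_site <- function(y, group) {
  c(
    t_test   = t.test(y ~ group)$p.value,
    wilcoxon = wilcox.test(y ~ group, exact = TRUE)$p.value
  )
}

# Apply the tests to every site (row); the result has one row per site
# and one column per test.
pvals <- t(apply(meth, 1, test_site, group = group))
print(pvals)

# Fisher's exact test works on counts rather than percentages. Here we
# assume (hypothetically) that methylated and unmethylated read counts
# at a single site have been pooled within each group.
counts <- matrix(
  c(164,  36,
     80, 120),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("normal", "senescent"),
                  read  = c("methylated", "unmethylated"))
)
fisher_p <- fisher.test(counts)$p.value
print(fisher_p)
```

Sites whose p-values fall below a chosen significance threshold would be the candidates for differential methylation. Comparing the columns of `pvals` across sites is one simple way to see where the t test and the Wilcoxon rank-sum test agree or disagree.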

Biography

Samuel is a senior majoring in applied mathematics. He is particularly interested in statistical studies and biological statistics. Sam is engaged to be married to Sarah Padgett in June and plans to continue studying statistics at KU in August. His hobbies include reading, card games, and video games.

Research Category

Sciences

Presentation Type

Poster Presentation

Document Type

Poster

Award

Sciences poster session, First place

Location

Upper Atrium/Hall

Presentation Date

16 Apr 2014, 9:00 am - 11:45 am

This document is currently not available here.
