Power and Sample Size Estimation in High Dimensional Biology
Abstract
Genomic scientists often test thousands of hypotheses in a single experiment. One example is a microarray experiment that seeks to determine differential gene expression among experimental groups. Planning such experiments involves determining a sample size that will allow meaningful interpretations. Traditional power analysis methods may not be well suited to this task when thousands of hypotheses are tested in discovery-oriented basic research. We introduce the concept of expected discovery rate (EDR) and an approach that combines parametric mixture modelling with parametric bootstrapping to estimate the sample size needed for a desired accuracy of results. While the examples included are derived from microarray studies, the methods presented here are 'extraparadigmatic' in their approach to study design and are applicable to most high-dimensional biological situations. Pilot data from three different microarray experiments are used to extrapolate EDR, as well as the related false discovery rate, at different sample sizes and thresholds.
Recommended Citation
G. L. Gadbury et al., "Power and Sample Size Estimation in High Dimensional Biology," Statistical Methods in Medical Research, SAGE Publications, Jan 2004.
Department(s)
Mathematics and Statistics
Keywords and Phrases
Genomic scientists; expected discovery rate (EDR); microarray; sample size
International Standard Serial Number (ISSN)
0962-2802
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2004 SAGE Publications, All rights reserved.
Publication Date
01 Jan 2004