Power and Sample Size Estimation in High Dimensional Biology

Abstract

Genomic scientists often test thousands of hypotheses in a single experiment. One example is a microarray experiment that seeks to determine differential gene expression among experimental groups. Planning such experiments involves determining a sample size that will allow meaningful interpretations. Traditional power analysis methods may not be well suited to this task when thousands of hypotheses are tested in discovery-oriented basic research. We introduce the concept of expected discovery rate (EDR) and an approach that combines parametric mixture modelling with parametric bootstrapping to estimate the sample size needed for a desired accuracy of results. While the examples included are derived from microarray studies, the methods herein are 'extraparadigmatic' in their approach to study design and are applicable to most high-dimensional biological situations. Pilot data from three different microarray experiments are used to extrapolate EDR, as well as the related false discovery rate, at different sample sizes and thresholds.
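To make the relationship between sample size, EDR, and false discovery rate concrete, the following is a minimal illustrative sketch, not the mixture-model and bootstrap procedure described in the article. It assumes EDR means the proportion of truly differentially expressed genes that are declared significant, and it replaces pilot data with simulated two-group data using a known unit variance and a simple z-test. The function name `simulate_edr_fdr` and all parameter values (`prop_alt`, `effect`, `alpha`) are hypothetical choices made only for illustration.

```python
import random
from statistics import NormalDist, fmean


def simulate_edr_fdr(n_per_group, n_genes=2000, prop_alt=0.2,
                     effect=1.0, alpha=0.05, seed=0):
    """Simulate one two-group experiment and return (EDR, FDR).

    Assumptions (illustrative only): unit-variance normal data, a fixed
    mean shift `effect` for truly changed genes, and a two-sided z-test
    with known variance applied at threshold `alpha`.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference of two group means with variance 1.
    se = (2.0 / n_per_group) ** 0.5

    true_pos = false_pos = n_alt = 0
    for _ in range(n_genes):
        is_alt = rng.random() < prop_alt          # gene truly differs?
        shift = effect if is_alt else 0.0
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]

        z = (fmean(group_b) - fmean(group_a)) / se
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))

        n_alt += is_alt
        if p_value < alpha:
            if is_alt:
                true_pos += 1
            else:
                false_pos += 1

    # EDR: share of true effects that were discovered.
    edr = true_pos / n_alt if n_alt else float("nan")
    # FDR: share of discoveries that are false.
    discoveries = true_pos + false_pos
    fdr = false_pos / discoveries if discoveries else 0.0
    return edr, fdr


# Compare a few candidate sample sizes per group.
for n in (5, 10, 20):
    edr, fdr = simulate_edr_fdr(n)
    print(f"n per group = {n:2d}  EDR = {edr:.2f}  FDR = {fdr:.2f}")
```

Under these assumptions, increasing the number of samples per group should raise the simulated EDR and lower the simulated false discovery rate at a fixed threshold, which is the kind of trade-off a sample size calculation of this type is meant to expose.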

Department(s)

Mathematics and Statistics

Keywords and Phrases

Genomic scientists; expected discovery rate (EDR); microarray; sample size

International Standard Serial Number (ISSN)

0962-2802

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2004 SAGE Publications, All rights reserved.

Publication Date

01 Jan 2004
