Epistemological Issues in Omics and High-Dimensional Biology: Give the People What They Want
Gene expression microarrays have been the vanguard of new analytic approaches in high-dimensional biology. Draft sequences of several genomes, coupled with new technologies, allow study of the influences and responses of entire genomes rather than isolated genes. This has opened a new realm of high-dimensional biology where questions involve multiplicity at unprecedented scales: thousands of genetic polymorphisms, gene expression levels, protein measurements, genetic sequences, or any combination of these and their interactions. Such situations demand creative approaches to the processes of inference, estimation, prediction, classification, and study design. Although bench scientists intuitively grasp the need for flexibility in the inferential process, the elaboration of formal supporting statistical frameworks is only beginning. Here, we will discuss some of the unique statistical challenges facing investigators studying high-dimensional biology, describe some approaches being developed by statistical scientists, and offer an epistemological framework for the validation of proffered statistical procedures. A key theme will be the challenge in providing methods that a statistician judges to be sound and a biologist finds informative. The shift from family-wise error rate control to false discovery rate estimation and to assessment of ranking and other forms of stability will be portrayed as illustrative of approaches to this challenge.
T. Mehta et al., "Epistemological Issues in Omics and High-Dimensional Biology: Give the People What They Want," Physiological Genomics, American Physiological Society, Jan 2006.
The definitive version is available at http://dx.doi.org/10.1152/physiolgenomics.00095.2006
Mathematics and Statistics
Keywords and Phrases
Microarray Experiments; Statistical Genomics; Proteomics
Article - Journal
© 2006 American Physiological Society, All rights reserved.