Control of the global significance level becomes an issue whenever many hypotheses are tested at once. A common approach is the Bonferroni correction, which has very low power when a large number of tests are performed.
Many current problems (e.g., gene discovery) involve a number of tests on the order of tens or even hundreds of thousands. An alternative to Bonferroni is control of the False Discovery Rate (FDR), the expected proportion of false rejections among the discoveries, if any. The problem of extending FDR control to dependent test statistics is still somewhat open and partly confined to the case of positive dependence. Ad hoc modifications to the standard procedure often lead to a loss of power. We show that a certain degree of dependence among the test statistics, regardless of its direction, is allowed without any modification to the standard procedures controlling the FDR.
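The power gap between the two approaches can be illustrated with a minimal sketch. It assumes that the "standard procedure" for FDR control is the Benjamini-Hochberg step-up procedure, which the abstract does not name; the function names and the p-values are hypothetical and serve only as an illustration.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the familywise error rate)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the
    largest rank such that p_(k) <= k * alpha / m."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank  # keep the largest rank that satisfies the bound
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


# Hypothetical p-values, for illustration only (not data from the talk)
pvals = [0.001, 0.008, 0.012, 0.019, 0.030, 0.20, 0.45, 0.60, 0.75, 0.90]
n_bonf = sum(bonferroni(pvals))
n_bh = sum(benjamini_hochberg(pvals))
print("Bonferroni rejections:", n_bonf)
print("Benjamini-Hochberg rejections:", n_bh)
```

On these ten made-up p-values the step-up rule rejects more hypotheses than Bonferroni at the same nominal level, because its threshold grows with the rank of the p-value instead of staying fixed at alpha / m.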
We then introduce a new class of multiple testing procedures that can control the FDR and other functionals of the ratio of false rejections to discoveries, at the user's choice. The class is based on the idea of generalized augmentation: first, hypotheses are rejected without any correction; then the set of rejections is adjusted by adding or removing a suitable number of rejected hypotheses. We focus on the probability that the False Discovery Proportion (FDP) exceeds a pre-specified threshold, referred to as the False Discovery eXceedance (FDX). In simulations and real-data applications, the new class proves much more powerful than the available methods for controlling the FDX, especially when the number of tests is large. A slight modification makes Type I error rate control valid under arbitrary dependence of the test statistics. Different applications illustrate the use of FDP control in multiple testing, including spatial disease mapping, gene discovery, and image reconstruction.
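The error rates mentioned above can be written compactly. The notation is an assumption and does not come from the abstract: V denotes the number of false rejections, R the total number of rejections (discoveries), c the pre-specified threshold, and \alpha the nominal level.

```latex
\[
\mathrm{FDP} = \frac{V}{\max(R, 1)}, \qquad
\mathrm{FDR} = \mathbb{E}\left[\mathrm{FDP}\right], \qquad
\mathrm{FDX} = \Pr\left(\mathrm{FDP} > c\right).
\]
% A procedure controls the FDR (respectively the FDX) at level \alpha when
\[
\mathrm{FDR} \le \alpha
\qquad \text{or} \qquad
\mathrm{FDX} \le \alpha .
\]
```

The \max(R, 1) in the denominator encodes the "if any" convention: when nothing is rejected, the FDP is taken to be zero.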
Alessio Farcomeni is a statistician, working as an assistant professor in Medical Statistics at Sapienza - University of Rome. His work focuses on robust statistics, longitudinal models, categorical data analysis, cluster analysis, and multiple testing. He is also involved in clinical, ecological, and econometric research.