Gridsemble: Selective Ensembling for False Discovery Rates

multiple hypothesis testing
genomics
Author

Jenna Landy

Published

January 23, 2024

Landy, Jenna M., and Parmigiani, Giovanni. “Gridsemble: Selective Ensembling for False Discovery Rates.” arXiv preprint arXiv:2401.12865 (2024).

Gridsemble is a data-driven selective ensembling algorithm for estimating local (fdr) and tail-end (Fdr) false discovery rates in large-scale multiple hypothesis testing. Existing methods for estimating fdr often yield different conclusions, yet the unobservable nature of fdr values prevents the use of traditional model selection. Our method circumvents this challenge by ensembling a subset of methods with weights based on their estimated performances, which are computed on synthetic datasets generated to mimic the observed data while including ground truth. This paper is on arXiv and is currently under review. The corresponding R software package is on GitHub.
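To make the idea of selective ensembling concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Gridsemble algorithm or the API of its R package. The candidate "methods" are hypothetical two-groups models with fixed parameters, the synthetic data come from hard-coded generating parameters (standing in for a model fit to the observed data), performance is measured by mean squared error against the known true fdr, and the inverse-MSE weighting and the `keep` cutoff are choices made only for this example.

```python
import math
import random


def normal_pdf(z, mu=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def two_groups_fdr(z, pi0, mu1):
    """Local fdr under a two-groups model: pi0 * f0(z) / f(z)."""
    f0 = normal_pdf(z)          # null density, N(0, 1)
    f1 = normal_pdf(z, mu=mu1)  # alternative density, N(mu1, 1)
    return pi0 * f0 / (pi0 * f0 + (1 - pi0) * f1)


def simulate(n, pi0, mu1, rng):
    """Draw synthetic test statistics together with their true local fdr."""
    zs = [rng.gauss(0, 1) if rng.random() < pi0 else rng.gauss(mu1, 1)
          for _ in range(n)]
    truth = [two_groups_fdr(z, pi0, mu1) for z in zs]
    return zs, truth


# Hypothetical candidate estimators, each a fixed (pi0, mu1) setting.
CANDIDATES = {"a": (0.90, 3.0), "b": (0.80, 2.5), "c": (0.50, 1.0), "d": (0.95, 4.0)}


def selective_ensemble(candidates, synthetic_sets, keep=2):
    """Score candidates on synthetic data, keep the best, weight by inverse MSE."""
    mse = {}
    for name, (pi0, mu1) in candidates.items():
        errors = [(two_groups_fdr(z, pi0, mu1) - t) ** 2
                  for zs, truth in synthetic_sets
                  for z, t in zip(zs, truth)]
        mse[name] = sum(errors) / len(errors)
    selected = sorted(mse, key=mse.get)[:keep]  # lowest estimated error first
    inverse = {name: 1.0 / (mse[name] + 1e-12) for name in selected}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def ensemble_fdr(z, candidates, weights):
    """Weighted average of the selected candidates' fdr estimates at z."""
    return sum(w * two_groups_fdr(z, *candidates[name])
               for name, w in weights.items())


rng = random.Random(0)
# Assumed generating parameters; in practice these would be fit to observed data.
synthetic_sets = [simulate(500, 0.85, 2.8, rng) for _ in range(5)]
weights = selective_ensemble(CANDIDATES, synthetic_sets, keep=2)
print(weights)
print(ensemble_fdr(2.5, CANDIDATES, weights))
```

Because the synthetic datasets carry ground truth, each candidate can be scored directly, which is what the unobservable fdr values rule out on the real data. Candidates that score poorly are dropped rather than merely down-weighted, which is the "selective" part of the sketch.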


Advised by Giovanni Parmigiani, PhD
Department of Data Science, Dana-Farber Cancer Institute
Department of Biostatistics, Harvard T.H. Chan School of Public Health