bayesNMF: Fast Bayesian Poisson NMF with Automatically Learned Rank Applied to Mutational Signatures

bayesian computation

mutational signatures

genomics

Author

Jenna Landy

Published

February 25, 2025

Landy, Jenna M., Nishanth Basava, and Giovanni Parmigiani. “bayesNMF: Fast Bayesian Poisson NMF with Automatically Learned Rank Applied to Mutational Signatures.” arXiv preprint arXiv:2502.18674 (2025).

Bayesian Non-Negative Matrix Factorization (NMF) is a method of interest across fields including genomics, neuroscience, and audio and image processing. Bayesian Poisson NMF is of particular importance for counts data, for example in cancer mutational signatures analysis. However, MCMC methods for Bayesian Poisson NMF require a computationally intensive augmentation. Further, identifying latent rank is necessary, but commonly used heuristic approaches are slow and potentially subjective, and methods that learn rank automatically are unable to provide posterior uncertainties. In this paper, we introduce bayesNMF, a computationally efficient Gibbs sampler for Bayesian Poisson NMF. The desired Poisson-likelihood NMF is paired with a Normal-likelihood NMF used for high overlap proposal distributions in approximate Metropolis steps, avoiding augmentation. We additionally define Bayesian factor inclusion (BFI) and sparse Bayesian factor inclusion (SBFI) as methods to identify rank automatically while preserving posterior uncertainty quantification on the learned matrices. We provide an open-source R software package with all models and plotting capabilities demonstrated in this paper on GitHub at jennalandy/bayesNMF. While our applications focus on mutational signatures, our software and results can be extended to any use of Bayesian Poisson NMF

Advised by Giovanni Parmigiani, PhD
Department of Data Science, Dana Farber Cancer Institute
Department of Biostatistics, Harvard T.H. Chan School of Public Health