Perform a robust estimation of alpha coefficients by bootstrap to reach a certain level of cosine similarity given a set of observed counts x and discovered signatures beta.

signaturesSignificance(
  x,
  beta,
  cosine_thr = 0.95,
  min_contribution = 0.05,
  pvalue_thr = 0.05,
  nboot = 100,
  num_processes = Inf,
  verbose = TRUE
)

Arguments

x

Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, CN or CN counts; in the case of SBS it would be an n patients x 96 trinucleotides matrix.

beta

Discovered signatures to be used for the fit of alpha.

cosine_thr

Level of cosine similarity to be reached for the fit of alpha.

min_contribution

Minimum contribution of a signature to be considered significant.

pvalue_thr

Pvalue level to be used to assess significance.

nboot

Number of bootstrap iterations to be performed.

num_processes

Number of processes to be used during parallel execution. To execute in single process mode, this parameter needs to be set to either NA or NULL.

verbose

Boolean. Shall I print information messages?

Value

A list with the bootstrap estimates. It includes 5 elements: alpha: matrix of the discovered exposure values considering significant signatures as estimated by bootstrap. beta: matrix of the discovered signatures. unexplained_mutations: number of unexplained mutations per sample. goodness_fit: vector reporting cosine similarities between predictions and observations. bootstrap_estimates: list of matrices reporting results by bootstrap estimates.

Examples

data(background)
data(patients)
set.seed(12345)
beta <- signaturesDecomposition(x = patients[seq_len(3),seq_len(2)],
                                K = 3:4,
                                background_signature = background[seq_len(2)],
                                nmf_runs = 2,
                                num_processes = 1)
#> Performing signatures discovery and rank estimation...
#> Performing inference for K=3...
#> Performing NMF run 1 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing NMF run 2 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing inference for K=4...
#> Performing NMF run 1 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing NMF run 2 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
set.seed(12345)
res <- signaturesSignificance(x = patients[seq_len(3),seq_len(2)],
                              beta = beta$beta[[1]],
                              cosine_thr = 0.95,
                              min_contribution = 0.05,
                              pvalue_thr = 0.05,
                              nboot = 5,
                              num_processes = 1)
#> Estimating the contribution of each signature to the fit with a total of 5 bootstrap iterations...
#> Performing iteration 1 out of 5...
#> Performing iteration 2 out of 5...
#> Performing iteration 3 out of 5...
#> Performing iteration 4 out of 5...
#> Performing iteration 5 out of 5...
#> Estimating level of significance for each signature...
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact p-value with ties
#> Performing fit of alpha considering only signatures with significant contribution...