Perform a robust bootstrap estimation of the alpha coefficients, aiming to reach a given level of cosine similarity, given a set of observed counts x and discovered signatures beta.
signaturesSignificance(
  x,
  beta,
  cosine_thr = 0.95,
  min_contribution = 0.05,
  pvalue_thr = 0.05,
  nboot = 100,
  num_processes = Inf,
  verbose = TRUE
)
x: Counts matrix for a set of n patients and m categories. These can be, e.g., SBS, MNV, or CN counts; for SBS, it would be a matrix of n patients x 96 trinucleotides.
beta: Discovered signatures to be used for the fit of alpha.
cosine_thr: Level of cosine similarity to be reached for the fit of alpha.
min_contribution: Minimum contribution of a signature for it to be considered significant.
pvalue_thr: P-value threshold used to assess significance.
nboot: Number of bootstrap iterations to be performed.
num_processes: Number of processes to be used during parallel execution. To execute in single-process mode, set this parameter to either NA or NULL.
verbose: Boolean. Whether to print information messages.
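To illustrate how min_contribution, pvalue_thr, and nboot could interact, here is a minimal base R sketch. It is not the package's implementation: the use of a one-sided Wilcoxon signed-rank test is an assumption, and the matrix boot_contrib with signatures S1 and S2 is made-up data standing in for per-iteration bootstrap contributions.

```r
set.seed(1)
nboot <- 20
min_contribution <- 0.05
pvalue_thr <- 0.05

# Made-up relative contributions (rows: bootstrap iterations, columns: signatures).
boot_contrib <- cbind(
  S1 = runif(nboot, min = 0.50, max = 0.70),  # consistently above the minimum
  S2 = runif(nboot, min = 0.00, max = 0.04)   # consistently below the minimum
)

# Assumed test: is the contribution of each signature greater than min_contribution?
pvalues <- apply(boot_contrib, 2, function(v) {
  wilcox.test(v, mu = min_contribution, alternative = "greater")$p.value
})

# Signatures retained as significant under this sketch.
significant <- names(pvalues)[pvalues < pvalue_thr]
significant
```

With a small nboot, such a test has few observations to work with, so a larger number of iterations may give more stable p-values.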
A list with the bootstrap estimates. It includes 5 elements:
alpha: matrix of the discovered exposure values considering significant signatures as estimated by bootstrap.
beta: matrix of the discovered signatures.
unexplained_mutations: number of unexplained mutations per sample.
goodness_fit: vector reporting cosine similarities between predictions and observations.
bootstrap_estimates: list of matrices reporting results by bootstrap estimates.
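As a rough illustration of what a cosine similarity between predictions and observations means, the following base R sketch computes one value per sample from toy matrices. The helper cosine_similarity and all numbers are hypothetical, and it is an assumption that predictions are obtained as alpha multiplied by beta; the package may compute goodness_fit differently.

```r
# Toy data: 2 samples, 2 signatures, 4 categories (all values are made up).
alpha <- matrix(c(10, 0,
                   5, 5), nrow = 2, byrow = TRUE)
beta <- matrix(c(0.7, 0.1, 0.1, 0.1,
                 0.1, 0.1, 0.1, 0.7), nrow = 2, byrow = TRUE)
x <- matrix(c(7, 1, 1, 1,
              3, 2, 1, 4), nrow = 2, byrow = TRUE)

# Assumed reconstruction of the counts: exposures times signatures.
predicted <- alpha %*% beta

# Hypothetical helper: cosine similarity between two numeric vectors.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# One similarity value per sample (row of x).
goodness_fit <- vapply(seq_len(nrow(x)), function(i) {
  cosine_similarity(x[i, ], predicted[i, ])
}, numeric(1))

# Which samples reach the default threshold from the usage above?
cosine_thr <- 0.95
goodness_fit >= cosine_thr
```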
data(background)
data(patients)
set.seed(12345)
# Discover signatures on a small subset (3 patients, 2 categories) to keep the example fast.
beta <- signaturesDecomposition(x = patients[seq_len(3), seq_len(2)],
                                K = 3:4,
                                background_signature = background[seq_len(2)],
                                nmf_runs = 2,
                                num_processes = 1)
#> Performing signatures discovery and rank estimation...
#> Performing inference for K=3...
#> Performing NMF run 1 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing NMF run 2 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing inference for K=4...
#> Performing NMF run 1 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Performing NMF run 2 out of 2...
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
#> Warning: Option grouped=FALSE enforced in cv.glmnet, since < 3 observations per fold
set.seed(12345)
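# Assess signature significance using the first set of discovered signatures and 5 bootstrap iterations.
res <- signaturesSignificance(x = patients[seq_len(3), seq_len(2)],
                              beta = beta$beta[[1]],
                              cosine_thr = 0.95,
                              min_contribution = 0.05,
                              pvalue_thr = 0.05,
                              nboot = 5,
                              num_processes = 1)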
#> Estimating the contribution of each signature to the fit with a total of 5 bootstrap iterations...
#> Performing iteration 1 out of 5...
#> Performing iteration 2 out of 5...
#> Performing iteration 3 out of 5...
#> Performing iteration 4 out of 5...
#> Performing iteration 5 out of 5...
#> Estimating level of significance for each signature...
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact p-value with ties
#> Warning: cannot compute exact p-value with ties
#> Performing fit of alpha considering only signatures with significant contribution...
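Once a result like res is available, its elements can be inspected with ordinary list access. The sketch below uses a made-up stand-in (mock_res) so that it runs without the package; the helper summarise_fit is hypothetical, and it is an assumption that alpha has samples in rows and signatures in columns with dimnames set, and that goodness_fit is a named vector.

```r
# Hypothetical helper: summarise a list shaped like the documented return value.
summarise_fit <- function(res, cosine_thr = 0.95) {
  list(
    # Signatures with a non-zero exposure in at least one sample.
    active_signatures = colnames(res$alpha)[colSums(res$alpha > 0) > 0],
    # Samples whose cosine similarity falls below the threshold.
    poorly_fit_samples = names(res$goodness_fit)[res$goodness_fit < cosine_thr]
  )
}

# Made-up stand-in for `res`; real values come from signaturesSignificance().
mock_res <- list(
  alpha = matrix(c(120, 0, 30,
                    80, 0,  0),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("P1", "P2"), c("S1", "S2", "S3"))),
  goodness_fit = c(P1 = 0.98, P2 = 0.91)
)

summary_out <- summarise_fit(mock_res)
summary_out$active_signatures
summary_out$poorly_fit_samples
```

If the real object follows the same layout, summarise_fit(res) could be called on it directly.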