asceticCCFResampling • ASCETIC

Run the ASCETIC inference framework on single-sample datasets (using CCF), with re-sampling for a robust estimation of the agony ranking.

Usage

asceticCCFResampling(
  dataset,
  ccfDataset,
  vafDataset,
  nsampling = 100,
  regularization = c("aic", "bic"),
  command = "hc",
  restarts = 10
)

Arguments

dataset: Binary matrix where rows are samples and columns are mutations. Each cell of the matrix is 1 if the related mutation was observed in the sample; 0 otherwise. Values reported in dataset must be consistent with the ones reported in ccfDataset and vafDataset.
ccfDataset: Matrix where rows are samples and columns are mutations. Each cell of the matrix is the cancer cell fraction (CCF) estimated for the related mutation when observed in the sample. The CCF value is 0 if the mutation was not observed in the sample. Values reported in ccfDataset must be consistent with the ones reported in dataset and vafDataset.
vafDataset: R data.frame with 8 columns: 1) SAMPLE_ID, sample name. 2) GENE_ID, gene name. 3) REF_COUNT, total counts for the reference allele. 4) ALT_COUNT, total counts for the alternate allele. 5) COPY_NUMBER, total copy number estimate. 6) NORMAL_PLOIDY, ploidy for the normal sample; this is 1 for mutations on sex chromosomes and 2 otherwise. 7) VAF_ESTIMATE, variant allele frequency (VAF) estimate. 8) CCF_ESTIMATE, cancer cell fraction (CCF) estimate. Values reported in vafDataset must be consistent with the ones reported in dataset and ccfDataset.
nsampling: Number of re-sampling iterations to be performed for a robust estimation of the agony ranking. Higher values lead to improved estimates, but at a higher computational cost; default value is 100.
regularization: Regularization to be used for the maximum likelihood estimation. Possible values are aic for the Akaike information criterion and bic for the Bayesian information criterion. For the complete list of options, see the manual of the bnlearn package.
command: Optimization technique to be used for maximum likelihood estimation. Valid values are either hc for Hill Climbing or tabu for Tabu Search.
restarts: Number of restarts to be performed during the maximum likelihood estimation when the Hill Climbing optimization technique is used. Higher values lead to improved estimates, but at a higher computational cost; default value is 10. This parameter is ignored if Tabu Search is selected.
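
The consistency requirement between dataset and ccfDataset can be illustrated with a minimal base R sketch. The toy values, the gene names, and the helper isConsistent are hypothetical and not part of ASCETIC; the check also assumes that an observed mutation always has a CCF strictly greater than 0, which the descriptions above suggest but do not state outright.

```r
# Toy CCF matrix: rows are samples, columns are mutations (hypothetical values).
ccfToy <- matrix(
  c(0.9, 0.0, 0.4,
    0.0, 0.7, 0.0,
    1.0, 0.3, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("S1", "S2", "S3"), c("TP53", "KRAS", "APC"))
)

# Binary matrix: 1 where the mutation was observed (CCF > 0), 0 otherwise.
datasetToy <- (ccfToy > 0) * 1

# Hypothetical helper (not an ASCETIC function): TRUE when both matrices
# share the same sample and mutation names, the dataset is binary, and
# every cell agrees on whether the mutation was observed.
isConsistent <- function(dataset, ccfDataset) {
  identical(dimnames(dataset), dimnames(ccfDataset)) &&
    all(dataset %in% c(0, 1)) &&
    all((ccfDataset > 0) == (dataset == 1))
}

print(isConsistent(datasetToy, ccfToy))
```

A similar check on vafDataset could compare its column names against the eight listed above before calling the function.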

Value

A list of 5 elements: 1) dataset, input dataset. 2) ccfDataset, input ccfDataset. 3) rankingEstimate, ranking among mutations estimated by agony. Lower rankings correspond to early mutations. 4) poset, partially ordered set among mutations estimated by ASCETIC from the agony ranking. 5) inference, inferred ASCETIC evolutionary model for each selected regularization.
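
As a rough sketch of how the returned list might be inspected, the snippet below compares the element names listed above with those of an actual result and prints a one-level overview. It assumes the ASCETIC package and its bundled example datasets are installed, and it is skipped otherwise; the variable names expectedElements and res are illustrative only.

```r
# Element names as listed in the Value section.
expectedElements <- c("dataset", "ccfDataset", "rankingEstimate",
                      "poset", "inference")

# Run only when the ASCETIC package is installed.
if (requireNamespace("ASCETIC", quietly = TRUE)) {
  library(ASCETIC)
  set.seed(12345)
  data(datasetExampleSingleSamples)
  data(ccfDatasetExampleSingleSamples)
  data(vafDatasetExampleSingleSamples)
  res <- asceticCCFResampling(
    dataset = datasetExampleSingleSamples,
    ccfDataset = ccfDatasetExampleSingleSamples,
    vafDataset = vafDatasetExampleSingleSamples,
    nsampling = 5,
    regularization = "aic",
    command = "hc",
    restarts = 0
  )
  # Report any documented element that is missing from the result.
  print(setdiff(expectedElements, names(res)))
  # Compact overview of each element without printing full contents.
  str(res, max.level = 1)
}
```

The internal structure of each element is not described here, so str() is used rather than assuming particular columns or fields.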

Examples

set.seed(12345)
data(datasetExampleSingleSamples)
data(ccfDatasetExampleSingleSamples)
data(vafDatasetExampleSingleSamples)
resExampleSingleSamplesResampling <- asceticCCFResampling(
  dataset = datasetExampleSingleSamples,
  ccfDataset = ccfDatasetExampleSingleSamples,
  vafDataset = vafDatasetExampleSingleSamples,
  nsampling = 5,
  regularization = "aic",
  command = "hc",
  restarts = 0
)
#> 0 
#> 0.2 
#> 0.4 
#> 0.6 
#> 0.8 
#> 1