Construct a Bioconductor-classed object from an analysis.
# S3 method for FacileLinearModelDefinition
biocbox(
x,
assay_name = NULL,
method = NULL,
features = NULL,
filter = "default",
filter_universe = NULL,
filter_require = NULL,
with_sample_weights = FALSE,
weights = NULL,
block = NULL,
prior_count = 0.1,
...
)
# S3 method for FacileDgeAnalysisResult
biocbox(x, cached = TRUE, ...)
assay_name: The name of the assay to pull data for.
method: The name of the dge method that will be used. This will dictate the post-processing of the data.
filter: A filtering policy to remove uninteresting genes.
If "default" (which is the default), then edgeR::filterByExpr() is
used if we are materializing a DGEList; otherwise, lowly expressed
features are removed in a similarly "naive" manner. This can,
alternatively, be a character vector that holds the names of the features
that should be kept. Default value: "default".
with_sample_weights: Some methods that leverage the limma pipeline,
like "voom", "limma", and "limma-trend", can use sample (array)
quality weights to downweight outlier samples. When
method == "voom", we use limma::voomWithQualityWeights(), while the
rest use limma::arrayWeights(). The choice of method therefore determines
which sample weighting function to use. Defaults to FALSE.
prior_count: The pseudo-count to add to count data. Used primarily
when running the limma-trend method on count (RNA-seq) data.
...: Passed down to internal modeling and filtering functions.
a facile_frame that enumerates the samples to fetch
data for, as well as the covariates used in downstream analysis
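The sample-weighting rule described above (one limma function for "voom", another for the remaining limma-based methods) can be summarized in a few lines of base R. This is only an illustrative sketch: sample_weight_function() is a hypothetical helper written for this example, not a function exported by this package, and it returns the name of the limma function rather than calling it.

```r
# Hypothetical helper (not part of the package): map a dge method name to the
# limma sample-weighting function that the documentation says is used for it.
sample_weight_function <- function(method) {
  switch(
    method,
    "voom" = "limma::voomWithQualityWeights",
    "limma" = ,                                # falls through to the next entry
    "limma-trend" = "limma::arrayWeights",
    NA_character_                              # assumed: no sample weights for other methods
  )
}

sample_weight_function("voom")
sample_weight_function("limma-trend")
```

Returning NA for methods outside the limma pipeline is an assumption made for this sketch; the text above only states which methods can use sample weights.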
a DGEList or EList with assay data in the correct place, and all of
the covariates in the $samples or $targets data.frame that are required
to test the model in mdef.
This function accepts a model defined using flm_def() and
creates the appropriate Bioconductor assay container to test the model,
given the assay_name and dge method specified by the user.
This function currently supports retrieving data and whipping it into a DGEList (for count-like data) or an EList (for data that can be analyzed with one form of limma or another).
Assumptions on different assay_type values include:
rnaseq: assumed to be "vanilla" bulk rnaseq gene counts
umi: data from bulk rnaseq, UMI data, like quantseq
tpm: TPM values. These will be log2(TPM + prior_count) transformed,
then differentially tested using the limma-trend pipeline
TODO: support affymrna, affymirna, etc. assay types
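As a minimal illustration of the log2(TPM + prior_count) transformation mentioned for the tpm assay type, the base R sketch below applies it to a small made-up matrix. The matrix values and the object names (tpm, log_tpm) are assumptions for this example only; prior_count = 0.1 is the default shown in the usage above.

```r
# Made-up TPM matrix for illustration: rows are features, columns are samples.
tpm <- matrix(
  c(0, 1.9, 7.9,
    0.9, 3.9, 15.9),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)

prior_count <- 0.1  # default value from the biocbox() signature

# The pseudo-count keeps zero TPM values finite after the log transform.
log_tpm <- log2(tpm + prior_count)
log_tpm
```

Without the pseudo-count, the zero entry for geneA in s1 would become -Inf, which is why a small positive prior_count is added before taking logs.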
The "filter" parameters are described in the fdge() function for now.
Given a FacileDgeAnalysisResult, we can re-materialize the Bioconductor assay
container that was used within the differential testing pipeline run by fdge().
Currently we have limited our analysis framework to work over either DGEList
(edgeR) or EList (limma) containers.