Builds (simple) design and contrast matrices for use with fdge()

This simplifies the design and contrast building process by allowing for simple model definitions that are, essentially, functions of a single covariate. More elaborate models can be analyzed, but the user must then define the design and the coef / contrast to test manually, and pass those into fdge().

flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...
)

# S3 method for data.frame
flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...,
  contrast. = NULL,
  .fds = NULL
)

# S3 method for tbl
flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...
)

# S3 method for facile_frame
flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...,
  custom_key = NULL
)

# S3 method for FacileDataStore
flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...,
  samples = NULL,
  custom_key = NULL
)

# S3 method for ReactiveFacileDataStore
flm_def(
  x,
  covariate,
  numer = NULL,
  denom = NULL,
  batch = NULL,
  block = NULL,
  on_missing = c("warning", "error"),
  ...,
  samples = active_samples(x),
  custom_key = user(x)
)

Arguments

x: a dataset
covariate: the name of the "main effect" sample_covariate we are performing a contrast against.
numer: character vector defining the covariate/groups that make up the numerator
denom: character vector defining the covariate/groups that make up the denominator
batch: character vector defining the covariate/groups to use as batch effects
block: a string that names the covariate to use for the blocking factor in a random effects model.
on_missing: when a covariate level is missing (NA) for a sample, the setting of this parameter (default "warning") dictates the behavior of this function. When "warning", a warning is raised and the message is stored in the $warnings element of the result. When "error", an error is thrown. See the "Missing Covariates" section for more information.
contrast.: A custom contrast vector can be passed in for extra tricky comparisons that we haven't figured out how to put a GUI in front of.

Value

a list with:

$test: "ttest" or "anova"
$covariates: the pData over the samples (dataset, sample_id, ...)
$design: the design matrix (always 0-intercept)
$contrast: the contrast vector that defines the comparison asked for
$messages: A character vector of messages generated
$warnings: A character vector of warnings generated
$errors: A character vector of errors generated
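
To make the $design and $contrast elements concrete, here is a minimal base R sketch of what a 0-intercept design matrix and a matching contrast vector look like for a tumor vs normal comparison with sex as a batch term. The toy sample table and its values are made up for illustration, and this is not necessarily how flm_def() builds these objects internally.

```r
# Toy sample table (hypothetical values); column names mirror the
# covariates used in the examples on this page.
samples <- data.frame(
  sample_type = factor(c("normal", "normal", "tumor", "tumor", "tumor", "normal")),
  sex = factor(c("f", "m", "f", "m", "f", "m"))
)

# 0-intercept design: one column per level of the main covariate,
# followed by the column(s) for the batch term.
design <- model.matrix(~ 0 + sample_type + sex, data = samples)

# Contrast vector for tumor vs normal: +1 on the numerator column,
# -1 on the denominator column, 0 on the batch column(s).
contrast <- setNames(numeric(ncol(design)), colnames(design))
contrast["sample_typetumor"] <- 1
contrast["sample_typenormal"] <- -1

print(colnames(design))
print(contrast)
```

Because the design has no intercept, each level of the main covariate gets its own coefficient, so the comparison of interest is expressed as a difference of two coefficients rather than as a single coefficient.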

Details

Note: a (likely) small modification of this function could let it support the "ratio of ratios" model setup.

Missing Covariates

Given the "ragged" nature of sample annotations in a FacileDataStore, some samples may have NAs as their values for the covariates under test. In this case, if on_missing is set to "error", an error will be thrown; otherwise a message will be stored in the warnings list element.

The samples that the differential expression should be run on will be enumerated by the (dataset,sample_id) pair in the result$covariates tibble.
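
The handling described above can be sketched in base R as follows. This is only an illustration under assumptions: the toy table, the variable names (test_covariates, kept, dropped, warnings), and the decision to set incomplete samples aside are hypothetical and are not taken from the flm_def() source.

```r
# Toy "ragged" sample annotations (hypothetical values).
samples <- data.frame(
  dataset = "BLCA",
  sample_id = paste0("s", 1:5),
  stage = c("II", NA, "IV", "III", NA),
  sex = c("f", "m", "m", NA, "f"),
  stringsAsFactors = FALSE
)

test_covariates <- c("stage", "sex")
on_missing <- "warning"  # or "error"

# A sample is usable only if every covariate under test is non-NA.
is_complete <- complete.cases(samples[, test_covariates, drop = FALSE])
kept <- samples[is_complete, , drop = FALSE]
dropped <- samples[!is_complete, c("dataset", "sample_id"), drop = FALSE]

# Either fail outright or record a message, depending on on_missing.
warnings <- character()
if (nrow(dropped) > 0) {
  msg <- sprintf("%d sample(s) have NA values in the covariates under test",
                 nrow(dropped))
  if (on_missing == "error") {
    stop(msg)
  }
  warnings <- c(warnings, msg)
}

print(kept[, c("dataset", "sample_id")])
print(warnings)
```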

data.frame

The *.data.frame function definition assumes that x is a data.frame of samples (dataset, sample_id) and that the covariates defined on these samples (i.e. all the other columns of x) contain a superset of the variable names used to construct the design matrix for the model definition.
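
Before calling the data.frame method, the superset requirement can be checked with a few lines of base R. This is a sketch only: the toy table and the helper names (needed, missing_vars) are hypothetical and are not part of the package.

```r
# Toy sample table (hypothetical values) with the two identifier
# columns plus the covariates defined on the samples.
x <- data.frame(
  dataset = "BLCA",
  sample_id = paste0("s", 1:4),
  sample_type = c("tumor", "normal", "tumor", "normal"),
  sex = c("f", "m", "m", "f"),
  stringsAsFactors = FALSE
)

# Variables the model definition would use: the main covariate
# followed by the batch covariates.
needed <- c("sample_type", "sex", "stage")

# Anything listed here must be added to x before defining the model.
missing_vars <- setdiff(needed, colnames(x))
has_ids <- all(c("dataset", "sample_id") %in% colnames(x))

print(missing_vars)
print(has_ids)
```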

facile_frame

When we define a model off of a facile_frame, we expect it to look like a wide covariate table. Its (dataset, sample_id) columns define the samples we will build a model on, and its remaining columns hold any covariates defined on these samples.

If there are covariates used in the covariate or batch parameters that are not found in colnames(x), we will attempt to retrieve them from the FacileDataStore fds(x). If they cannot be found, this function will raise an error.

Examples

efds <- FacileData::exampleFacileDataSet()

# Look for tumor vs normal differences, controlling for sex
model_info <- efds %>%
  FacileData::filter_samples(indication == "BLCA") %>%
  flm_def(covariate = "sample_type", numer = "tumor", denom = "normal",
          batch = "sex")

# Same comparison, controlling for both sex and stage
m2 <- efds %>%
  FacileData::filter_samples(indication == "BLCA") %>%
  flm_def(covariate = "sample_type", numer = "tumor", denom = "normal",
          batch = c("sex", "stage"))

# stageIV vs stageII & stageIII
m3 <- efds %>%
  FacileData::filter_samples(indication == "BLCA", sample_type == "tumor") %>%
  flm_def(covariate = "stage", numer = "IV", denom = c("II", "III"),
          batch = "sex")

# Incomplete ttest to help with custom contrast vector
mi <- efds %>%
  FacileData::filter_samples(indication == "BLCA", sample_type == "tumor") %>%
  flm_def(covariate = "stage", batch = "sex", contrast. = "help")
#> Error in cm[, j] <- eval(ej, envir = levelsenv): number of items to replace is not a multiple of replacement length

# ANOVA across stage in BLCA, control for sex
m4 <- efds %>%
  FacileData::filter_samples(indication == "BLCA") %>%
  flm_def(covariate = "stage", batch = "sex")