The (fetch|with)_assay_data functions are some of the main workhorse functions of the facile ecosystem. These calls enable you to retrieve raw and normalized assay data from a FacileData container.

  samples = NULL,
  assay_name = default_assay(x),
  normalized = FALSE,
  batch = NULL,
  main = NULL,
  as.matrix = FALSE,
  subset.threshold = 700,
  aggregate = FALSE, = "ewm",
  verbose = FALSE

# S3 method for facile_frame
  assay_name = NULL,
  normalized = TRUE,
  aggregate = FALSE, = "ewm",
  spread = TRUE,
  with_assay_name = FALSE,
  verbose = FALSE,
  .fds = fds(x)



A FacileDataStore object, or facile_frame


a feature descriptor (data.frame with assay and feature_id columns)


a samples descriptor


the name of the assay to fetch data from. Defaults to the value of default_assay() for x. Must be a subset of assay_names(x).


return normalized or raw data values, defaults to FALSE. This is only really "functional" for assay_type = "rnaseq" types of assays, where the normalized data is log2(CPM). These values can be tweaked with the log = (TRUE|FALSE) and prior.count parameters, which can be passed down internally to (eventually) edgeR::cpm().


The column names in sample_info that specify the batch covariates in the data that will be regressed out.


The name of a covariate in sample_info that describes the "effect" of an experiment and should not be regressed out. Please refer to the Details section for more information.


by default, the data is returned in a long-form tbl-like result. If set to TRUE, the data is returned as a matrix.


parameters to pass to normalization methods


sometimes fetching all the genes is faster than trying to subset. We have to figure out why that is, but in previous tests with random feature sets of different sizes, the elbow was at around 700 features.

do you want individual level results or geneset scores? Use 'ewm' for eigenWeightedMean, which is currently the only supported option.


A FacileDataSet object


character vector of feature_ids


Do you want gene symbols returned, too?


A tibble (lazy or not) with assay data.

a tbl-like result


fetch_assay_data(x, ...) will return the data in long form. with_assay_data(x, ...) is most typically used when you already have a dataset x (a facile_frame) that you want to decorate with more assay data. The assay data asked for will be appended on to x in wide format. Because fetch is (most often) used at a lower level of granularity, normalized is by default set to FALSE, while it is set to TRUE in with_assay_data.
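The difference between the long and wide layouts can be sketched with base R alone. This is only an illustration of the two shapes, not the package's implementation: the toy tables and their column names (sample_id, feature_id, value, sex) are assumptions made up for the example.

```r
# Toy long-form assay data: one row per (sample, feature) pair.
# Column names are illustrative assumptions, not a guaranteed schema.
long <- data.frame(
  sample_id  = rep(c("s1", "s2", "s3"), each = 2),
  feature_id = rep(c("5551", "3001"), times = 3),
  value      = c(1.2, 3.4, 2.1, 0.5, 4.0, 2.2),
  stringsAsFactors = FALSE
)

# A sample-level table to decorate (a stand-in for a facile_frame).
samples <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  sex       = c("f", "m", "f"),
  stringsAsFactors = FALSE
)

# Spread the long table so each feature becomes its own column
# (columns are named value.<feature_id>) ...
wide <- reshape(long, idvar = "sample_id", timevar = "feature_id",
                direction = "wide")

# ... and append those columns to the sample table.
decorated <- merge(samples, wide, by = "sample_id")
print(decorated)
```

The long table is convenient for filtering and summarizing across many features, while the decorated wide table keeps one row per sample, which is usually what plotting and modeling code expects.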

Removing Batch Effects

When normalized data is returned, we assume these data are log-like, and you have the option to regress out batch effects using our remove_batch_effect() wrapper to limma::removeBatchEffect().
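To make the idea concrete, here is a minimal base R sketch of regressing a batch covariate out of log-like values while protecting a "main" effect. It follows the general linear-model approach that limma::removeBatchEffect() is based on, but it is not that function's code nor the remove_batch_effect() wrapper; the simulated data and all variable names are assumptions for illustration.

```r
set.seed(1)
n     <- 12
batch <- factor(rep(c("a", "b"), times = n / 2))      # nuisance covariate
main  <- factor(rep(c("ctrl", "trt"), each = n / 2))  # effect to protect

# Simulated log-like values for one feature: a treatment effect (+2)
# plus a shift of +1.5 in batch "b" and a little noise.
y <- 5 + 2 * (main == "trt") + 1.5 * (batch == "b") + rnorm(n, sd = 0.1)

# Design columns: the protected effect (with intercept) and the batch
# term coded with sum-to-zero contrasts (intercept column dropped).
X_main  <- model.matrix(~ main)
X_batch <- model.matrix(~ batch,
                        contrasts.arg = list(batch = "contr.sum"))[, -1, drop = FALSE]

# Fit both parts jointly, then keep only the batch coefficients.
fit  <- lm.fit(cbind(X_main, X_batch), y)
beta <- fit$coefficients[-seq_len(ncol(X_main))]

# Subtract the fitted batch component; the main effect stays in the data.
y_adj <- as.numeric(y - X_batch %*% beta)

# Difference between batch means before and after adjustment
print(c(before = mean(y[batch == "b"]) - mean(y[batch == "a"]),
        after  = mean(y_adj[batch == "b"]) - mean(y_adj[batch == "a"])))
```

Fitting the main covariate jointly with the batch term is what keeps a real experimental effect from being absorbed into the batch correction, which is the role the main argument plays here.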


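# Tumor samples from the BLCA indication in the example dataset
samples <- exampleFacileDataSet() %>%
  filter_samples(indication == "BLCA", sample_type == "tumor")
features <- c(PRF1='5551', GZMA='3001', CD274='29126')

# Regress out a single batch covariate
dat <- with_assay_data(samples, features, normalized = TRUE, batch = "sex")

# Regress out several batch covariates at once
dat <- with_assay_data(samples, features, normalized = TRUE,
                       batch = c("sex", "stage"))

# Same, but protect the effect described by the "main" covariate
dat <- with_assay_data(samples, features, normalized = TRUE,
                       batch = c("sex", "stage"), main = "sample_type")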