Expression counts are processed using edgeR, following the edgeR User's Guide. Briefly, counts for each sample are filtered to remove lowly expressed promoters, normalized for library size, and transformed into counts per million (CPM). Optionally, CPM values are log2 transformed after adding a pseudocount. Basal expression level is calculated by averaging the expression values of the base_lvl samples.
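The steps described above can be illustrated with a minimal base R sketch. It is not the function's implementation: it skips the edgeR filtering and normalization-factor steps, uses raw column sums as library sizes, and assumes the pseudocount is added to CPM before taking log2. The toy counts, the sample names, and the variable names (`base_samples`, `log_cpm`, `U`, `Y`) are hypothetical.

```r
# Toy count matrix: 4 promoters (rows) x 4 samples (columns), made-up values.
counts <- matrix(
  c( 10, 20, 30, 40,
      0,  5, 10, 15,
    100, 80, 60, 40,
      1,  1,  2,  2),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("p", 1:4), c("ctl_1", "ctl_2", "trt_1", "trt_2")))

# Hypothetical basal-level samples and pseudocount.
base_samples <- c("ctl_1", "ctl_2")
pseudo_count <- 1L

# Counts per million: divide each column by its library size, scale by 1e6.
# (Assumption: plain column sums stand in for edgeR's effective library sizes.)
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# Optional log2 transform with a pseudocount.
log_cpm <- log2(cpm + pseudo_count)

# U: expression averaged over the basal-level samples (one column).
U <- matrix(
  rowMeans(log_cpm[, base_samples, drop = FALSE]),
  ncol = 1,
  dimnames = list(rownames(log_cpm), "u"))

# Y: expression of the remaining samples (basal-level samples dropped).
Y <- log_cpm[, !colnames(log_cpm) %in% base_samples, drop = FALSE]
```

Here `U` and `Y` only mirror the roles of the two experiments listed under Value; the real function wraps its results in a MultiAssayExperiment object.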

Usage

prepareCountsForRegression(
  counts,
  design,
  base_lvl,
  log2 = TRUE,
  pseudo_count = 1L,
  drop_base_lvl = TRUE
)

Arguments

counts

matrix of read counts.

design

matrix giving the design matrix for the samples. Columns correspond to sample groups and rows to sample names.

base_lvl

string indicating the group in design that corresponds to the basal expression level, i.e. the reference samples against which expression changes will be compared.

log2

logical flag indicating if log2(counts per million) should be returned.

pseudo_count

integer count to be added before taking log2.

drop_base_lvl

logical flag indicating if base_lvl samples should be dropped from the resulting MultiAssayExperiment object.

Value

MultiAssayExperiment object with two experiments:

U

matrix giving expression values averaged over basal level samples

Y

matrix of expression values

The design with base_lvl dropped is stored in the object's metadata and is directly available to modelGeneExpression.

Examples

data("rinderpest_mini")
base_lvl <- "00hr"
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)
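
The design matrix in the example above is typed out by hand. As a sketch of an alternative, a matrix with the same layout can be derived from a vector of group labels using base R's `model.matrix`. The sample names `s1` to `s9` are hypothetical stand-ins for `colnames(rinderpest_mini)`.

```r
# Hypothetical sample names; in the example above these would be
# colnames(rinderpest_mini).
samples <- paste0("s", 1:9)

# One group label per sample, three replicates per time point.
group <- factor(rep(c("00hr", "12hr", "24hr"), each = 3))

# "~ 0 + group" drops the intercept so every group gets its own 0/1 column.
design <- model.matrix(~ 0 + group)

# model.matrix prefixes column names with the variable name ("group00hr");
# rename them so they match the group labels, and label rows by sample.
colnames(design) <- levels(group)
rownames(design) <- samples
```

Each row then contains a single 1 marking the group of that sample, matching the hand-written matrix as long as the samples are ordered by group.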