Process count matrix for expression modeling
prepareCountsForRegression.Rd
Expression counts are processed using edgeR, following the edgeR User's Guide.
Briefly, counts for each sample are filtered to remove lowly expressed promoters,
normalized for library size, and transformed into counts per million (CPM).
Optionally, CPM values are log2 transformed after adding a pseudo count. Basal
level expression is calculated by averaging the expression values of the base_lvl
samples.
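The steps above can be sketched in base R on a toy matrix. This is only an illustration under stated assumptions: it uses plain library-size scaling instead of edgeR's filtering and normalization, it adds the pseudo count directly to the CPM values (the package may apply it differently), and `toy_counts`, the sample names and the choice of base-level samples are hypothetical.

```r
# Toy count matrix: 3 promoters x 4 samples (hypothetical values).
toy_counts <- matrix(
  c(10, 20, 30, 40,
     0,  5, 10, 15,
    90, 75, 60, 45),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3"), c("s1", "s2", "s3", "s4"))
)

# Counts per million: scale each sample (column) by its library size.
lib_size <- colSums(toy_counts)
cpm <- sweep(toy_counts, 2, lib_size, "/") * 1e6

# Optional log2 transform after adding a pseudo count (assumed placement).
pseudo_count <- 1
log_cpm <- log2(cpm + pseudo_count)

# Basal level: average the expression of the reference samples
# (s1 and s2 are assumed to form the base-level group).
base_samples <- c("s1", "s2")
basal <- rowMeans(log_cpm[, base_samples, drop = FALSE])
```

Here `basal` plays the role of the averaged basal expression, one value per promoter, while `log_cpm` holds the per-sample expression values.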
Usage
prepareCountsForRegression(
  counts,
  design,
  base_lvl,
  log2 = TRUE,
  pseudo_count = 1L,
  drop_base_lvl = TRUE
)
Arguments
- counts
matrix of read counts.
- design
design matrix for the samples. Columns correspond to sample groups and rows to sample names.
- base_lvl
string indicating the group in design that corresponds to the basal expression level, i.e. the reference samples against which expression change will be compared.
- log2
logical flag indicating whether log2(counts per million) should be returned.
- pseudo_count
integer count to be added before taking log2.
- drop_base_lvl
logical flag indicating whether base_lvl samples should be dropped from the resulting MultiAssayExperiment object.
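A design matrix of the shape described for the design argument can also be derived from a vector of group labels rather than typed out by hand. The following is a minimal sketch using `model.matrix` from base R; the sample names and group labels are hypothetical, and the sanity checks at the end are a suggestion, not something the package is known to require.

```r
# Hypothetical sample names and group labels (three replicates per group).
samples <- paste0("sample", 1:6)
group <- factor(rep(c("00hr", "24hr"), each = 3))

# "~ 0 + group" drops the intercept so every group gets its own 0/1 column.
design <- model.matrix(~ 0 + group)
dimnames(design) <- list(samples, levels(group))

# Strip model.matrix bookkeeping attributes, leaving a plain matrix.
attr(design, "assign") <- NULL
attr(design, "contrasts") <- NULL

# Sanity checks: each sample belongs to exactly one group,
# and the chosen base level is one of the columns.
base_lvl <- "00hr"
stopifnot(all(rowSums(design) == 1), base_lvl %in% colnames(design))
```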
Value
MultiAssayExperiment object with two experiments:
- U
matrix giving expression values averaged over basal level samples
- Y
matrix of expression values
The design matrix with base_lvl dropped is stored in the object's metadata and is
directly available to modelGeneExpression.
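To make "design with base_lvl dropped" concrete, here is one plausible reading in base R: the base-level column is removed together with the rows of the samples that belong to it. This is an assumption for illustration; the package's actual handling may differ, and the matrix and names below are hypothetical.

```r
# Hypothetical design: 2 base-level samples and 2 treated samples.
design <- matrix(
  c(1, 0,
    1, 0,
    0, 1,
    0, 1),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("s", 1:4), c("00hr", "24hr"))
)
base_lvl <- "00hr"

# Samples assigned to the base-level group.
is_base <- design[, base_lvl] == 1

# Assumed behaviour: remove the base-level rows and the base-level column.
design_no_base <- design[!is_base, colnames(design) != base_lvl, drop = FALSE]
```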
Examples
data("rinderpest_mini")
base_lvl <- "00hr"
# One-hot design matrix: rows are samples, columns are groups
# (three samples per group).
design <- matrix(
  data = c(1, 0, 0,
           1, 0, 0,
           1, 0, 0,
           0, 1, 0,
           0, 1, 0,
           0, 1, 0,
           0, 0, 1,
           0, 0, 1,
           0, 0, 1),
  ncol = 3,
  nrow = 9,
  byrow = TRUE,
  dimnames = list(colnames(rinderpest_mini), c("00hr", "12hr", "24hr")))
# Use the "00hr" group as the basal expression level.
mae <- prepareCountsForRegression(
  counts = rinderpest_mini,
  design = design,
  base_lvl = base_lvl)