Skip to main content

Replication: GCRMA (background, normalization & summarization)

Author: Mark Robinson and Henrik Bengtsson
Created: 2009-05-17
Last modified: 2010-02-05

Description

This test verifies that aroma.affymetrix can reproduce the GCRMA (Wu, Irizarry, Gentleman, Murillo, and Spencer, 2004) chip-effect estimates as estimated by the gcrma package.

Data set

rawData/  
  Affymetrix-HeartBrain/  
   HG-U133_Plus_2/  
    u1332plus_ivt_cerebellum_A.CEL [13555904 bytes]  
    u1332plus_ivt_cerebellum_B.CEL [13550687 bytes]  
    u1332plus_ivt_cerebellum_C.CEL [13551860 bytes]  
    u1332plus_ivt_heart_A.CEL      [13554731 bytes]  
    u1332plus_ivt_heart_B.CEL      [13553255 bytes]  
    u1332plus_ivt_heart_C.CEL      [13551203 bytes]

Source: Affymetrix Tissue samples, 2007. See Affymetrix data sets for chip type HG-U133_Plus_2.

Script

GCRMA estimates by aroma.affymetrix

library("aroma.affymetrix")
cs <- AffymetrixCelSet$byName("Affymetrix-HeartBrain", chipType="HG-U133_Plus_2")

# GCRMA background correction
bc <- GcRmaBackgroundCorrection(cs)
csB <- process(bc)

# RMA quantile normalization
qn <- QuantileNormalization(csB, typesToUpdate="pm")
csN <- process(qn)

# RMA probe summarization
plm <- RmaPlm(csN, flavor="oligo")
fit(plm)

# Extract chip effects on the log2 scale
ces <- getChipEffectSet(plm)
theta <- extractMatrix(ces)
rownames(theta) <- getUnitNames(cdf)
theta <- log2(theta)

GCRMA estimate by gcrma

library("gcrma");  # gcrma()
raw <- ReadAffy(filenames=getPathnames(cs))
es <- gcrma(raw, verbose=TRUE)
theta0 <- exprs(es)

Results

# Reorder the aroma.affymetrix estimates
o <- match(rownames(theta0), rownames(theta))
theta <- theta[o,]

# (a) Assert correlations
cors <- sapply(1:ncol(theta), FUN=function(cc) cor(theta[,cc],
theta0[,cc]))
print(cors)
print(range(cors))
stopifnot(all(cors > 0.99995))

# (b) Assert differences
e <- (theta - theta0)
stopifnot(mean(as.vector(e\^2)) < 1e-3)
stopifnot(sd(as.vector(e\^2)) < 1e-3)
stopifnot(quantile(abs(e), 0.99) < 0.05)
stopifnot(max(abs(e)) < 0.085)

# (c) Visual comparison
layout(matrix(1:9, ncol=3, byrow=TRUE))
xlab <- expression(log[2](theta[gcrma]))
ylab <- expression(log[2](theta[aroma.affymetrix]))
for (kk in seq(length=ncol(theta))) {
  main <- colnames(theta)[kk]
  plot(theta0[,kk], theta[,kk], pch=".", xlab=xlab, ylab=ylab,
main=main)
  abline(0,1, col="blue")
}
xlab <- expression(log[2](theta[aroma.affymetrix]/theta[gcrma]))
plotDensity(e, xlab=xlab);

Figure: (Top six panels): Scatter plots comparing the chip-effect estimates (on the log2 scale) from aroma.affymetrix with the ones from gcrma. (Bottom panel): The density of the log2-ratios between aroma.affymetrix and gcrma chip-effect estimates for the six arrays.

References

[1] Z. Wu, R. Irizarry, R. Gentleman, et al. A Model Based Background Adjustment for Oligonucleotide Expression Arrays. Johns Hopkins University Dept. of Biostatistics Working Paper Series 1001. Berkeley Electronic Press, Jul. 2004. URL: http://ideas.repec.org/p/bep/jhubio/1001.html.

Appendix

Session information

R version 2.10.1 Patched (2010-01-12 r50990)  
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hgu133plus2probe_2.5.0 AnnotationDbi_1.8.1
oligoClasses_1.8.0
[4] hgu133plus2cdf_2.5.0   preprocessCore_1.8.0   gcrma_2.18.1
[7] Biobase_2.6.1          aroma.affymetrix_1.4.3 aroma.apd_0.1.7
[10] affxparser_1.18.0      R.huge_0.2.0           aroma.core_1.4.4
[13] aroma.light_1.15.1     matrixStats_0.1.9      R.rsp_0.3.6
[16] R.filesets_0.7.4       digest_0.4.2           R.cache_0.2.0
[22] oligo_1.10.0           affyPLM_1.22.0         affy_1.24.2
[25] R.methodsS3_1.1.0

loaded via a namespace (and not attached):
[1] affyio_1.14.0      Biostrings_2.14.11 DBI_0.2-4
IRanges_1.4.11
[5] RSQLite_0.7-1      splines_2.10.1     tools_2.10.1i386-pc-mingw32

Redundancy test script

The above script is part of redundancy test suite executed at every new release:

path <- system.file("testScripts/replication/chipTypes/HG-U133_Plus_2", package="aroma.affymetrix")
filename <- "12.doGCRMA_vs_gcrma.R"
pathname <- file.path(path, filename)
source(pathname)