How to: Use fullname translators to rename data files
Q: I have download a set of HapMap CEL files and they all have filenames like NA07000_Hind_A8_3005533.CEL. How can I tell aroma.affymetrix that it is the first part NA07000 that is the sample name and the rest are tags? I do not want to rename my CEL files.
A: You can use custom translator functions to parse and translate the default fullnames into fullnames of your choice. For example, assume you have the following data set:
dsR <- AffymetrixCelSet$byName("HapMap,CEU,testSet", chipType="Mapping50K_Hind240")
print(getFullNames(dsR))
## [1] "NA06985_Hind_B5_3005533" "NA06991_Hind_B6_3005533"
## [3] "NA06993_Hind_B4_4000092" "NA06994_Hind_A7_3005533"
## [5] "NA07000_Hind_A8_3005533" "NA07019_Hind_A12_4000092"
## [7] "NA07022_Hind_A10_4000092" "NA07029_Hind_A9_4000092"
## [9] "NA07034_Hind_B1_4000092" "NA07048_Hind_B3_4000092"
The CEL files downloaded from HapMap has file names such as NA07000_Hind_A8_3005533.CEL. In order for aroma.affymetrix to identify 'NA07000' as the sample name, and 'A8' and '3005533' as tags (ignoring the 'Hind' part), use a function to translates the full name to a comma-separated fullname, e.g. 'NA07000_Hind_A8_3005533' to 'NA07000,A8,3005533'.
setFullNamesTranslator(dsR, function(names, ...) {
# Turn into comma-separated tags
names <- gsub("_", ",", names)
# Drop any Hind/Xba tags
names <- gsub(",(Hind|Xba)", "", names)
names
})
print(getFullNames(dsR))
## [1] "NA06985,B5,3005533" "NA06991,B6,3005533"
## [3] "NA06993,B4,4000092" "NA06994,A7,3005533"
## [5] "NA07000,A8,3005533" "NA07019,A12,4000092"
## [7] "NA07022,A10,4000092" "NA07029,A9,4000092"
## [9] "NA07034,B1,4000092" "NA07048,B3,4000092"
print(dsR)
## AffymetrixCelSet:
## Name: HapMap
## Tags: CEU,testSet
## Path: rawData/HapMap,CEU,testSet/Mapping50K_Hind240
## Platform: Affymetrix
## Chip type: Mapping50K_Hind240
## Number of arrays: 10
## Names: NA06985, NA06991, ..., NA07048
## Time period: 2004-01-14 14:02:08 -- 2004-02-13 11:51:01
## Total file size: 244.78MB
## RAM: 0.01MB