r - How can I speed up this sapply for cross-checking samples?
I'm trying to speed up a QC function that checks the similarity between samples. Is there a faster way to do the comparison than the one below? I know there have been pretty definitive answers to this kind of question (here or elsewhere), but I can't find them. I know I should investigate plyr, but I'm still getting the hang of sapply.
The following sample data is representative of the output I'm working with. It is randomized, but I don't think that affects how an answer applies to the original question.
## sample data
nsamples <- 1000
nsamplesqc <- 100
nassays <- 96
microarrayscores <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE), nrow = nsamples, ncol = nassays)
microarrayscoresqc <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE), nrow = nsamples, ncol = nassays)
## every experiment sample is paired with every QC sample
mycombs <- data.frame(experiment = rep(1:nsamples, nsamplesqc), qc = sort(rep(1:nsamplesqc, nsamples)))

## testing function
system.time(
  sapply(seq(length(mycombs[,1])), function(x) {
    ## compare one experiment row against one QC row, assay by assay
    compare <- microarrayscores[mycombs[x,1],] == microarrayscoresqc[mycombs[x,2],]
    ## fraction of matching calls among the assays where neither call is NA
    sum(compare[!is.na(compare)]) / sum(!is.na(compare))
  })
)
Here is a vectorized version of your code, which is about 20 times faster on my machine:
rowMeans(microarrayscores[mycombs[,1], ] == microarrayscoresqc[mycombs[,2], ], na.rm = TRUE)
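The vectorized form works because indexing a matrix with a vector of row numbers builds a new matrix with one row per pair, so a single == compares every pair at once, and rowMeans with na.rm = TRUE then gives the share of TRUE values among the non-NA comparisons in each row. That is the same ratio the sapply version computes. The following is a minimal sketch to check that the two agree; the small dimensions, the seed, and the names scores, scoresqc, combs, slow and fast are made up for illustration and are not from the question.

```r
set.seed(1)  # arbitrary seed, only so the check is reproducible

# Small, hypothetical dimensions so the row-by-row version finishes quickly
nsamples <- 20
nsamplesqc <- 5
nassays <- 8
calls <- c("g:g", "t:g", "t:t", NA)

scores <- matrix(sample(calls, nsamples * nassays, replace = TRUE),
                 nrow = nsamples, ncol = nassays)
scoresqc <- matrix(sample(calls, nsamplesqc * nassays, replace = TRUE),
                   nrow = nsamplesqc, ncol = nassays)

# Every experiment row paired with every QC row
combs <- expand.grid(experiment = seq_len(nsamples), qc = seq_len(nsamplesqc))

# Row-by-row version, following the question
slow <- sapply(seq_len(nrow(combs)), function(x) {
  compare <- scores[combs[x, 1], ] == scoresqc[combs[x, 2], ]
  sum(compare[!is.na(compare)]) / sum(!is.na(compare))
})

# Vectorized version, following the answer
fast <- rowMeans(scores[combs[, 1], ] == scoresqc[combs[, 2], ], na.rm = TRUE)

# Compare the two results within numerical tolerance
all.equal(slow, fast)
```

One trade-off to keep in mind: the vectorized version materializes two matrices with one row per pair (100,000 rows by 96 columns for the sizes in the question), so it spends memory to save time. If the number of pairs grows much larger, processing the pairs in chunks may be needed.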