R - How can I speed up this sapply for cross-checking samples?


I'm trying to speed up a QC function that checks for similarity between samples. Is there a faster way to do the comparison than the way I'm doing it below? I know there have been pretty definitive answers to this kind of question (here or elsewhere), but I can't find them. I know I should investigate plyr, but I'm still getting the hang of sapply.

The following sample data is representative of the output I'm working with. It is randomized, but I don't think that affects how the original question applies.

    ## sample data
    nsamples <- 1000
    nsamplesqc <- 100
    nassays <- 96
    microarrayscores <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE),
                               nrow = nsamples, ncol = nassays)
    microarrayscoresqc <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE),
                                 nrow = nsamples, ncol = nassays)
    ## every experiment sample paired with every QC sample
    mycombs <- data.frame(experiment = rep(1:nsamples, nsamplesqc),
                          qc = sort(rep(1:nsamplesqc, nsamples)))

    ## testing function
    system.time(
      sapply(seq_len(nrow(mycombs)), function(x) {
        ## compare one experiment row with one QC row, assay by assay
        compare <- microarrayscores[mycombs[x, 1], ] == microarrayscoresqc[mycombs[x, 2], ]
        ## fraction of matching calls among the non-missing comparisons
        sum(compare[!is.na(compare)]) / sum(!is.na(compare))
      })
    )

Here is a vectorized version of your code, which is about 20 times faster on my machine:

    rowMeans(microarrayscores[mycombs[, 1], ] ==
             microarrayscoresqc[mycombs[, 2], ], na.rm = TRUE)
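The vectorized form works because indexing a matrix with a vector of row numbers builds one large matrix with a row per pair, so a single `==` compares all pairs at once, and `rowMeans(..., na.rm = TRUE)` gives the share of matches among the non-missing comparisons in each row. As a rough sanity check, the sketch below compares the two approaches on a small data set. The names `scores`, `scoresqc` and `combs` and the sizes are made up for illustration and are not from the question.

```r
## Hypothetical small data set, only for checking that both approaches agree
set.seed(1)
genotypes <- c("g:g", "t:g", "t:t", NA)
scores   <- matrix(sample(genotypes, 20 * 12, replace = TRUE), nrow = 20)
scoresqc <- matrix(sample(genotypes, 4 * 12, replace = TRUE), nrow = 4)

## all experiment/QC pairs; the first column varies fastest, as in mycombs
combs <- expand.grid(experiment = seq_len(nrow(scores)),
                     qc = seq_len(nrow(scoresqc)))

## original approach: one comparison per call
loop_result <- sapply(seq_len(nrow(combs)), function(x) {
  compare <- scores[combs[x, 1], ] == scoresqc[combs[x, 2], ]
  sum(compare[!is.na(compare)]) / sum(!is.na(compare))
})

## vectorized approach: one comparison over the whole expanded matrix
vec_result <- rowMeans(scores[combs[, 1], ] == scoresqc[combs[, 2], ],
                       na.rm = TRUE)

## compare the two result vectors
all.equal(loop_result, vec_result)
```

One trade-off to keep in mind: the vectorized version materializes two matrices with one row per pair (100,000 x 96 for the sizes in the question), so it uses more memory in exchange for the speed.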
