R - How can I speed up this sapply for cross-checking samples?

- July 15, 2012

I'm trying to speed up a QC function that checks the similarity between samples. I wanted to know if there is a faster way to compare them than the way I'm doing it below. I know there have been pretty definitive answers to this kind of question (here or elsewhere), but I can't find them. I know I should investigate plyr, but I'm still getting the hang of sapply.

The following sample data is representative of the output I'm working with. It is randomized, but I don't think that will affect how the answer applies to the original question.

    ## sample data
    nsamples <- 1000
    nsamplesqc <- 100
    nassays <- 96
    microarrayscores   <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE),
                                 nrow = nsamples, ncol = nassays)
    microarrayscoresqc <- matrix(sample(c("g:g", "t:g", "t:t", NA), nsamples * nassays, replace = TRUE),
                                 nrow = nsamples, ncol = nassays)
    mycombs <- data.frame(experiment = rep(1:nsamples, nsamplesqc),
                          qc = sort(rep(1:nsamplesqc, nsamples)))

    ## testing function
    system.time(
      sapply(seq(length(mycombs[, 1])), function(x) {
        # compare one experiment row against one QC row, assay by assay
        compare <- microarrayscores[mycombs[x, 1], ] == microarrayscoresqc[mycombs[x, 2], ]
        # fraction of matching calls among the assays where neither call is NA
        sum(compare[!is.na(compare)]) / sum(!is.na(compare))
      })
    )

Here is a vectorized version of your code, which is about 20 times faster on my machine:

    rowMeans(microarrayscores[mycombs[, 1], ] ==
             microarrayscoresqc[mycombs[, 2], ], na.rm = TRUE)
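To check that the vectorized form computes the same thing as the sapply loop, here is a minimal sketch on a much smaller data set. The names `scores`, `scores_qc`, and `combs` and the sizes are illustrative assumptions, not taken from the question. The idea is that indexing each matrix by a whole vector of row numbers builds two aligned matrices, `==` compares them element by element, and `rowMeans(..., na.rm = TRUE)` gives the fraction of matches among the non-NA comparisons in each row.

```r
# Small reproducible data set (sizes are illustrative assumptions)
set.seed(42)
n_exp <- 20
n_qc <- 5
n_assays <- 96
calls <- c("g:g", "t:g", "t:t", NA)
scores    <- matrix(sample(calls, n_exp * n_assays, replace = TRUE), nrow = n_exp)
scores_qc <- matrix(sample(calls, n_qc * n_assays, replace = TRUE), nrow = n_qc)

# All experiment/QC pairs; the first column varies fastest, as in the question
combs <- expand.grid(experiment = seq_len(n_exp), qc = seq_len(n_qc))

# Loop version: one pair of rows at a time
loop_result <- sapply(seq_len(nrow(combs)), function(i) {
  compare <- scores[combs$experiment[i], ] == scores_qc[combs$qc[i], ]
  sum(compare, na.rm = TRUE) / sum(!is.na(compare))
})

# Vectorized version: all pairs at once
vec_result <- rowMeans(scores[combs$experiment, ] == scores_qc[combs$qc, ],
                       na.rm = TRUE)

# The two results should agree up to floating-point tolerance
print(all.equal(loop_result, vec_result))
```

The vectorized call trades memory for speed: it materializes two matrices with one row per pair, so with a very large number of pairs it may be worth processing `combs` in chunks.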
