Melissa - 6 months ago 30

R Question

I've been trying to debug this for the past 2 days, applying all the possible fixes I found here on Stack Overflow, but I'm still getting various errors and I don't know what I can do anymore.

**dat** is a data frame with 3051 rows and 38 columns, taken from the golub dataset in the multtest library.

sample of dat:

`> dat[1:5, 1:5]`

V1 V2 V3 V4 V5

g1 -1.45769 -1.39420 -1.42779 -1.40715 -1.42668

g2 -0.75161 -1.26278 -0.09052 -0.99596 -1.24245

g3 0.45695 -0.09654 0.90325 -0.07194 0.03232

g4 3.13533 0.21415 2.08754 2.23467 0.93811

g5 2.76569 -1.27045 1.60433 1.53182 1.63728

I have this function defined:

`> wilcox.func <- function(x, s1, s2) {`

+ x1 <- x[s1]

+ x2 <- x[s2]

+ x1 <- as.numeric(x1)

+ x2 <- as.numeric(x2)

+ w.out <- wilcox.test(x1, x2, exact=F, alternative="two.sided", correct=T)

+ out <- as.numeric(w.out$statistic)

+ return(out) }

and I try to apply it with:

`> apply(dat, 1, wilcox.func, s1=c(1:27), s2=c(28:38))`

where I want to run the wilcox.test() function with the first 27 columns as x and the remaining columns as y (based off golub.cl). However, I get this error:

`Error in wilcox.test(x1, x2, exact = F, alternative = "two.sided", correct = T) :`

unused arguments (exact = F, alternative = "two.sided", correct = T)

Removing

Funnily enough at some point I also got the error

I've also tried lapply() and mapply(), but I get the same unused arguments error.

What I'm trying to achieve: wilcox.test(), if I understand the problem correctly, should be applied to each row, with columns 1 to 27 as the x vector and columns 28 to 38 as the y vector.

I apologize if this is a stupid simple issue I'm missing. I just don't know what it is :(

Edit: this works now (as well as Parfait's code) after restarting R... sorry, that should've probably been something I tried first before posting this...
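
The thread does not say why restarting R helped. One plausible explanation (an assumption, not something the poster confirmed) is that a leftover object named `wilcox.test` in the workspace was shadowing the version from the stats package, which would produce exactly this kind of "unused arguments" error. The sketch below reproduces that situation with a hypothetical stale function and shows how `find()` and the `stats::` prefix could be used to check for it and work around it:

```r
# Hypothetical stale object: a two-argument function that shadows stats::wilcox.test
wilcox.test <- function(x, y) NULL

x1 <- c(1.1, 2.3, 3.8, 4.2)
x2 <- c(0.5, 2.9, 5.1)

# The shadowed call fails because the stale function has no 'exact' argument
msg <- tryCatch(
  wilcox.test(x1, x2, exact = FALSE),
  error = function(e) conditionMessage(e)
)

# Shows every place the name is defined, in search-path order
find("wilcox.test")

# Calling through the namespace bypasses the shadowing object
w <- stats::wilcox.test(x1, x2, exact = FALSE, alternative = "two.sided", correct = TRUE)
w_stat <- as.numeric(w$statistic)

# Removing the stale object restores the normal lookup
rm(wilcox.test)
```

If `find("wilcox.test")` lists `.GlobalEnv` ahead of `package:stats`, removing the object with `rm()` should have the same effect as a restart with a clean workspace.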

Answer

Consider `sapply()` or `vapply()` (to predefine the output type), iterating across row numbers, since you need to slice by column ranges for each row. The code below uses the sample data; adjust the column ranges for the full `dat`:

```
# Read in the sample dat (V0 holds the gene labels, so the values sit in columns 2 to 6)
data <- '
V0 V1 V2 V3 V4 V5
g1 -1.45769 -1.39420 -1.42779 -1.40715 -1.42668
g2 -0.75161 -1.26278 -0.09052 -0.99596 -1.24245
g3 0.45695 -0.09654 0.90325 -0.07194 0.03232
g4 3.13533 0.21415 2.08754 2.23467 0.93811
g5 2.76569 -1.27045 1.60433 1.53182 1.63728'
dat <- read.table(text = data, header = TRUE, stringsAsFactors = FALSE)

# Adjusted function: takes the two groups directly instead of index vectors
wilcox.func <- function(s1, s2) {
  # unlist() flattens the one-row data frame slices into plain vectors
  x1 <- as.numeric(unlist(s1))
  x2 <- as.numeric(unlist(s2))
  w.out <- wilcox.test(x1, x2, exact = FALSE, alternative = "two.sided", correct = TRUE)
  out <- as.numeric(w.out$statistic)
  return(out)
}

# sapply(): iterate over row numbers, slicing each row by column range
output <- sapply(seq_len(nrow(dat)), function(i)
  wilcox.func(dat[i, 2:4], dat[i, 5:6]))
output
# [1] 2 4 4 3 3

# vapply(): same, but declares that each result is a single numeric value
output <- vapply(seq_len(nrow(dat)), function(i)
  wilcox.func(dat[i, 2:4], dat[i, 5:6]),
  numeric(1))
output
# [1] 2 4 4 3 3
```
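
For the full 38-column layout, the `apply()` call from the question should also work once `wilcox.test` resolves to the stats version. The following is a minimal sketch under stated assumptions: `mat` is a simulated numeric matrix standing in for the golub data (random values, not the real dataset), and `wilcox_stat` is an illustrative name rather than anything from the original post.

```r
set.seed(1)

# Simulated stand-in for the golub matrix (hypothetical values, 6 genes x 38 samples)
mat <- matrix(rnorm(6 * 38), nrow = 6, ncol = 38,
              dimnames = list(paste0("g", 1:6), paste0("V", 1:38)))

# Same idea as the question's function: index one row by two column groups
wilcox_stat <- function(x, s1, s2) {
  w <- stats::wilcox.test(x[s1], x[s2], exact = FALSE,
                          alternative = "two.sided", correct = TRUE)
  unname(w$statistic)
}

# MARGIN = 1 passes each row to the function as a numeric vector
stats_by_row <- apply(mat, 1, wilcox_stat, s1 = 1:27, s2 = 28:38)
stats_by_row
```

`apply()` converts a data frame to a matrix first, so this relies on every column being numeric; the `stats::` prefix guards against a shadowed `wilcox.test`.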