Daniel Mahler - 3 months ago 12

R Question

I have 2 functions that I use inside a mutate call. One produces per row results as expected while the other repeats the same value for all rows:

```
library(dplyr)

df <- data.frame(X = rpois(5, 10), Y = rpois(5, 10))

pv <- function(a, b) {
  fisher.test(matrix(c(a, b, 10, 10), 2, 2),
              alternative = 'greater')$p.value
}

div <- function(a, b) a / b

mutate(df, d = div(X, Y), p = pv(X, Y))
```

which produces something like:

```
   X  Y         d         p
1  9 15 0.6000000 0.4398077
2  8  7 1.1428571 0.4398077
3  9 14 0.6428571 0.4398077
4 11 15 0.7333333 0.4398077
5 11  7 1.5714286 0.4398077
```

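i.e. the `d` column varies from row to row, but the `p` column is constant, and its value does not seem to correspond to any single row's `X` and `Y` pair.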

I suspect this relates to NSE, but I do not understand how from what little I have been able to find out about it.

What accounts for the different behaviours of `div` and `pv`, and how can I make `pv` work per row?

Answer

We need `rowwise()`, so that `mutate` evaluates `pv` once per row:

```
df %>%
rowwise() %>%
mutate(d = div(X,Y), p = pv(X,Y))
# X Y d p
# <int> <int> <dbl> <dbl>
#1 10 9 1.111111 0.5619072
#2 12 8 1.500000 0.3755932
#3 9 8 1.125000 0.5601923
#4 11 16 0.687500 0.8232217
#5 16 10 1.600000 0.3145350
```

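In the OP's code, `pv` receives the whole 'X' and 'Y' columns as input and returns a single value, which `mutate` then recycles across every row. `div` behaves as expected because `/` is vectorised and returns one result per element.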
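The difference can be seen without `dplyr` by calling both functions directly on whole vectors, which is what `mutate` does with columns. The sketch below is a minimal illustration in base R, using the `X` and `Y` values from the output printed in the question. The names `p_all` and `p_rows` are introduced here for illustration only, and `suppressWarnings()` is there because some R versions may warn when the data is longer than the matrix.

```r
# Same definitions as in the question
pv <- function(a, b) {
  fisher.test(matrix(c(a, b, 10, 10), 2, 2),
              alternative = "greater")$p.value
}
div <- function(a, b) a / b

# Values copied from the output shown in the question
X <- c(9, 8, 9, 11, 11)
Y <- c(15, 7, 14, 15, 7)

# `/` works element by element, so div() returns one value per row
length(div(X, Y))

# c(X, Y, 10, 10) has 12 elements, but matrix(..., 2, 2) keeps only the
# first four (X[1:4]), so fisher.test() sees one table and returns one p-value
p_all <- suppressWarnings(pv(X, Y))
length(p_all)

# Calling pv() once per (X, Y) pair gives one p-value per row
p_rows <- mapply(pv, X, Y)
length(p_rows)
```

So the single value is computed from the first four entries of `X` alone, which is why it does not match any individual row.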

Or, as @Frank mentioned, `mapply` can be used to call `pv` on each pair of 'X' and 'Y' values:

```
df %>%
mutate(d = div(X,Y), p = mapply(pv, X, Y))
```
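A related option, not mentioned in the original answer, is base R's `Vectorize()`, which wraps a function in `mapply` so that it accepts whole vectors. The sketch below is a minimal illustration under that assumption. `pv_vec` is a hypothetical name, and the data frame values are made up for the example.

```r
# Same definition as in the question
pv <- function(a, b) {
  fisher.test(matrix(c(a, b, 10, 10), 2, 2),
              alternative = "greater")$p.value
}

# Vectorize() returns a wrapper that calls pv() once per pair of elements
pv_vec <- Vectorize(pv)

# Hypothetical example data
df <- data.frame(X = c(10, 12, 9), Y = c(9, 8, 8))

# Base R assignment; with dplyr loaded, mutate(df, p = pv_vec(X, Y))
# should give the same column
df$p <- pv_vec(df$X, df$Y)
df
```

This keeps the `mutate` call itself unchanged in shape, at the cost of defining one extra wrapper function.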

Source (Stackoverflow)
