Toni Toni - 3 months ago 25
R Question

Rank() in R excluding zeros

I am trying to duplicate "manually" the example in this Wikipedia post using R.

Here is the data:

after = c(125, 115, 130, 140, 140, 115, 140, 125, 140, 135)
before = c(110, 122, 125, 120, 140, 124, 123, 137, 135, 145)
sgn = sign(after-before)
abs = abs(after - before)
d = data.frame(after,before,sgn,abs)

after before sgn abs
1 125 110 1 15
2 115 122 -1 7
3 130 125 1 5
4 140 120 1 20
5 140 140 0 0
6 115 124 -1 9
7 140 123 1 17
8 125 137 -1 12
9 140 135 1 5
10 135 145 -1 10


If I try to rank the rows based on the abs column, the 0 entry is naturally ranked as 1:

rank = rank(abs)
(d = data.frame(after,before,sgn,abs,rank))

after before sgn abs rank
1 125 110 1 15 8.0
2 115 122 -1 7 4.0
3 130 125 1 5 2.5
4 140 120 1 20 10.0
5 140 140 0 0 1.0
6 115 124 -1 9 5.0
7 140 123 1 17 9.0
8 125 137 -1 12 7.0
9 140 135 1 5 2.5
10 135 145 -1 10 6.0


However, zeros are ignored in the Wilcoxon signed-rank test.

How can I get R to ignore that row, so as to end up with:

after before sgn abs rank
1 125 110 1 15 7.0
2 115 122 -1 7 3.0
3 130 125 1 5 1.5
4 140 120 1 20 9.0
5 140 140 0 0 0
6 115 124 -1 9 4.0
7 140 123 1 17 8.0
8 125 137 -1 12 6.0
9 140 135 1 5 1.5
10 135 145 -1 10 5.0





SOLUTION (accepted answer below):

after = c(125, 115, 130, 140, 140, 115, 140, 125, 140, 135)
before = c(110, 122, 125, 120, 140, 124, 123, 137, 135, 145)
sgn = sign(after-before)
abs = abs(after - before)
d = data.frame(after,before,sgn,abs)
d$rank = rank(replace(abs, abs == 0, NA), na.last = 'keep')
d$multi = d$sgn * d$rank

(W = abs(sum(d$multi, na.rm = TRUE)))
9

Answer

From the Wikipedia article:

  1. Exclude pairs with |x_{2,i} - x_{1,i}| = 0. Let N_r be the reduced sample size.

We need to exclude zeroes. By my thinking, you should replace the zeroes with NA, and then tell rank() how to treat NAs. Since you need to return a vector of the same length as the input, you can pass 'keep' as the na.last argument, which leaves each NA in place with a rank of NA and ranks only the remaining values:

d$rank <- rank(replace(abs, abs == 0, NA), na.last = 'keep')
d
##    after before sgn abs rank
## 1    125    110   1  15  7.0
## 2    115    122  -1   7  3.0
## 3    130    125   1   5  1.5
## 4    140    120   1  20  9.0
## 5    140    140   0   0   NA
## 6    115    124  -1   9  4.0
## 7    140    123   1  17  8.0
## 8    125    137  -1  12  6.0
## 9    140    135   1   5  1.5
## 10   135    145  -1  10  5.0
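
To carry these ranks through to the test statistic, the NA row has to be skipped when summing. What follows is only a minimal sketch using the same after and before vectors as in the question; the names delta, rnk, signed_rank, W_plus and W_minus are introduced here for illustration and are not part of the original code.

```r
after  <- c(125, 115, 130, 140, 140, 115, 140, 125, 140, 135)
before <- c(110, 122, 125, 120, 140, 124, 123, 137, 135, 145)

delta <- after - before

# Zero differences become NA and keep their position; the rest are ranked 1..Nr
rnk <- rank(replace(abs(delta), delta == 0, NA), na.last = "keep")
signed_rank <- sign(delta) * rnk

W_plus  <- sum(signed_rank[delta > 0])    # sum of ranks of positive differences
W_minus <- -sum(signed_rank[delta < 0])   # sum of ranks of negative differences
W       <- abs(sum(signed_rank, na.rm = TRUE))

c(W_plus = W_plus, W_minus = W_minus, W = W)
```

With Nr = 9 non-zero pairs, W_plus and W_minus should add up to Nr * (Nr + 1) / 2 = 45, which is a quick sanity check on the ranking. As far as I know, the V statistic reported by wilcox.test(after, before, paired = TRUE) is the sum of the positive ranks, so it would correspond to W_plus here and not to W.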

The subtraction-based solutions will not work if the input vector contains no zeroes or more than one zero.
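
A small sketch of both cases, assuming "subtraction-based" means ranking everything and then subtracting 1 from each rank. The vectors no_zero and two_zeros are made-up inputs, and rank_nonzero is just an illustrative wrapper around the rank() call above.

```r
# Rank only the non-zero entries; zeroes come back as NA in their original position
rank_nonzero <- function(x) rank(replace(x, x == 0, NA), na.last = "keep")

no_zero   <- c(15, 7, 5, 20)
two_zeros <- c(15, 0, 5, 0, 20)

# No zeroes: subtracting 1 pushes the smallest non-zero value down to rank 0
rank_nonzero(no_zero)    # 3 2 1 4
rank(no_zero) - 1        # 2 1 0 3

# Two zeroes: the tied zeroes share rank 1.5, so every non-zero rank is off by one
rank_nonzero(two_zeros)  # 2 NA 1 NA 3
rank(two_zeros) - 1      # 3.0 0.5 2.0 0.5 4.0
```

The NA approach does not depend on how many zeroes there are, because the zeroes never take part in the ranking in the first place.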