useR - 1 year ago 63

R Question

`?sort`

`partial`

`NULL`

I tried:

`x <- c(1,3,5,2,4,6,7,9,8,10)`

sort(x)

## [1] 1 2 3 4 5 6 7 8 9 10

sort(x, partial=5)

## [1] 1 3 4 2 5 6 7 9 8 10

sort(x, partial=2)

## [1] 1 2 5 3 4 6 7 9 8 10

sort(x, partial=4)

## [1] 1 2 3 4 5 6 7 9 8 10

I am not sure what

`partial`

Answer Source

As `?sort`

states,

If partial is not NULL, it is taken to contain indices of elements of the result which are to be placed in their correct positions in the sorted array by partial sorting.

In other words, the following assertion is always true:

```
stopifnot(sort(x, partial=pt_idx)[pt_idx] == sort(x)[pt_idx])
```

for any `x`

and `pt_idx`

, e.g.

```
x <- sample(100) # input vector
pt_idx <- sample(1:100, 5) # indices for partial arg
```

This behavior is different from the one defined in the Wikipedia article on partial sorting. In R `sort()`

's case we are not necessarily computing *k* smallest elements.

For example, if

```
print(x)
## [1] 91 85 63 80 71 69 20 39 78 67 32 56 27 79 9 66 88 23 61 75 68 81 21 90 36 84 11 3 42 43
## [31] 17 97 57 76 55 62 24 82 28 72 25 60 14 93 2 100 98 51 29 5 59 87 44 37 16 34 48 4 49 77
## [61] 13 95 31 15 70 18 52 58 73 1 45 40 8 30 89 99 41 7 94 47 96 12 35 19 38 6 74 50 86 65
## [91] 54 46 33 22 26 92 53 10 64 83
```

and

```
pt_idx
## [1] 5 54 58 95 8
```

then

```
sort(x, partial=pt_idx)
## [1] 1 3 2 4 5 6 7 8 11 12 9 10 13 15 14 16 17 18 23 30 31 27 21 32 36 34 35 19 20 37
## [31] 38 33 29 22 26 25 24 28 39 41 40 42 43 48 46 44 45 47 51 50 52 49 53 54 57 56 55 58 59 60
## [61] 62 64 63 61 65 66 70 72 73 69 68 71 67 79 78 82 75 81 80 77 76 74 89 85 88 87 83 84 86 90
## [91] 92 93 91 94 95 96 97 99 100 98
```

Here `x[5]`

, `x[54]`

, ..., `x[8]`

are placed in their correct positions - and we cannot say anything else about the remaining elements. HTH.

**EDIT**: Partial sorting may reduce the sorting time, of course if you are interested in e.g. finding only some of the order statistics.

```
require(microbenchmark)
x <- rnorm(100000)
microbenchmark(sort(x, partial=1:10)[1:10], sort(x)[1:10])
## Unit: milliseconds
## expr min lq median uq max neval
## sort(x, partial = 1:10)[1:10] 2.342806 2.366383 2.393426 3.631734 44.00128 100
## sort(x)[1:10] 16.556525 16.645339 16.745489 17.911789 18.13621 100
```