Tim91 - 1 year ago
R Question

# Vectorization instead of looping in R

I want to reduce my processing time by replacing some `for` loops with a vectorized alternative.

Below is a simplified example of what I want to do with a much larger dataset.

``````
df <- data.frame(time = c(10, 12, 14, 14, 14, 17, 23, 23, 30, 32), ranks = vector(mode = 'double', length = 10))

df_hilf <- data.frame(time_hilf = c(10, 12, 14, 17, 23, 30, 32), ranking_hilf = c(1, 2, 4, 6, 7.5, 9, 10))

# For each unique time, write its rank into every matching row of df
for (j in seq_len(nrow(df_hilf))) {
  df$ranks[df$time == df_hilf$time_hilf[j]] <- df_hilf$ranking_hilf[j]
}
``````

I've generated a dataframe called df which is ordered by time. The goal is to assign the ranks of another dataframe (in this example called df_hilf) to the initial dataframe.

As you can see the dataframes differ in length because in df_hilf only the unique times of df are stored.

The ranks stored in df_hilf are calculated by a specific rule (adjusted ranks from reliability analysis); I've used midranks here only for simplicity. Hence I really need the specific ranks stored in df_hilf.

At the end I want to have the same rank for same time values in df.

``````
> df
   time ranks
1    10   1.0
2    12   2.0
3    14   4.0
4    14   4.0
5    14   4.0
6    17   6.0
7    23   7.5
8    23   7.5
9    30   9.0
10   32  10.0
``````

I think this could work with the function `replicate`, but I haven't found out how to set up the `n` argument, since the number of occurrences of each time value differs.

Unfortunately I also have not found a solution to this problem on the net. I apologize if I have overlooked something.

You could use `match()`:

``````
df$ranks <- df_hilf$ranking_hilf[match(df$time, df_hilf$time_hilf)]
#> df
#   time ranks
#1    10   1.0
#2    12   2.0
#3    14   4.0
#4    14   4.0
#5    14   4.0
#6    17   6.0
#7    23   7.5
#8    23   7.5
#9    30   9.0
#10   32  10.0
``````
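To see why this works, it may help to look at the intermediate index vector that `match()` returns. The following is only a sketch using the example data from the question; the names `idx`, `counts` and `ranks_rep` are made up for illustration. The last part shows a `rep()`-based variant in the spirit of the question, which assumes that df is sorted by time and that df_hilf lists the unique times in the same order.

```r
df <- data.frame(time = c(10, 12, 14, 14, 14, 17, 23, 23, 30, 32))
df_hilf <- data.frame(
  time_hilf    = c(10, 12, 14, 17, 23, 30, 32),
  ranking_hilf = c(1, 2, 4, 6, 7.5, 9, 10)
)

# For each time in df, the row of df_hilf that holds that time
idx <- match(df$time, df_hilf$time_hilf)
idx
#> [1] 1 2 3 3 3 4 5 5 6 7

# Indexing with idx repeats each rank as often as its time occurs in df
df$ranks <- df_hilf$ranking_hilf[idx]

# A time that is missing from df_hilf would produce NA, so check for that
stopifnot(!anyNA(df$ranks))

# Alternative with rep(): repeat each rank by the count of its time value.
# Assumption: df is sorted by time and df_hilf has the unique times in order.
counts <- as.vector(table(df$time))
ranks_rep <- rep(df_hilf$ranking_hilf, times = counts)
stopifnot(isTRUE(all.equal(ranks_rep, df$ranks)))
```

The `match()` version does not depend on the row order of either data frame, so it is probably the safer choice. Note that `match()` compares values exactly, so times that come out of floating-point calculations might need rounding first.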