Andrej - 3 years ago 149
R Question

# How to avoid a for loop when iterating through unique values in a column [R]

Let's assume that we have the following toy data:

``````
library(tidyverse)
data <- tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)
``````

Each row represents one network relationship (an edge between `id1` and `id2`) for a subject. There are three unique subjects in the data, and the network for the first subject can be written as a sequence of relations:

``````
a -- b, a -- c, b -- c
``````
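
As a small illustration of that representation, the sketch below builds the first subject's network directly from its three relations. The names `edges_s1` and `g1` are made up for this example, and a base `data.frame` is used only to keep the snippet self-contained.

```r
library(igraph)

# Edge list for subject 1 only, copied from the toy data above
# (edges_s1 and g1 are hypothetical names used for illustration)
edges_s1 <- data.frame(
  id1 = c("a", "a", "b"),
  id2 = c("b", "c", "c"),
  stringsAsFactors = FALSE
)

# The first two columns are treated as the endpoints of each edge
g1 <- graph_from_data_frame(edges_s1, directed = FALSE)

vcount(g1)  # number of nodes in the graph
ecount(g1)  # number of edges in the graph
```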

The task is to compute node centralities (eigenvector centrality in the code below) for each subject's network. With a `for` loop this is straightforward:

``````
library(igraph)
# Get unique subjects
subjects_uniq <- unique(data$subject)

# Compute centrality of nodes for each graph, keeping one result per subject
centralities <- vector("list", length(subjects_uniq))
for (i in seq_along(subjects_uniq)) {
  # Filter on the subject's value rather than on the loop index
  current_data <- data %>% filter(subject == subjects_uniq[i]) %>% select(-subject)
  current_graph <- current_data %>% graph_from_data_frame(directed = FALSE)
  centralities[[i]] <- eigen_centrality(current_graph)$vector
}
``````

Question: My dataset is huge, so I wonder how to avoid an explicit `for` loop. Should I use `apply()` and its modern cousins (maybe `map()` in the `purrr` package)? Any suggestions are welcome.

Here is an option using `map()` from `purrr`, which returns one centrality vector per subject:

``````
library(tidyverse)
library(igraph)
map(subjects_uniq, ~data %>%
      filter(subject == .x) %>%
      select(-subject) %>%
      graph_from_data_frame(directed = FALSE) %>%
      {eigen_centrality(.)$vector})
#[[1]]
#a b c
#1 1 1

#[[2]]
#        a         b         c         d
#1.0000000 0.8546377 0.8546377 0.4608111

#[[3]]
#a b c
#1 1 1
``````
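
Since the dataset is described as huge, one variation that may be worth trying is to split the data once by `subject` rather than filtering the full table for every subject. The following is a minimal sketch of that idea, not a benchmarked solution. The name `centralities` is an assumption for this example, and `map()` still iterates internally; the possible gain comes from avoiding a full scan of `data` for each subject.

```r
library(dplyr)
library(purrr)
library(igraph)

# Toy data from the question
data <- tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)

# Split once into a list of per-subject edge lists (named by subject),
# then compute eigenvector centrality for each piece
centralities <- data %>%
  split(.$subject) %>%
  map(function(edges) {
    g <- graph_from_data_frame(select(edges, -subject), directed = FALSE)
    eigen_centrality(g)$vector
  })

# Results are accessed by subject label, e.g. the second subject
centralities[["2"]]
```

Because `split()` names the list elements after the subject values, each result stays tied to its subject even when the subject IDs are not consecutive integers.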