Andrej Andrej - 3 years ago 149
R Question

How to avoid for loop when iterating through unique values in a column [R]

Let's assume that we have the following toy data:

library(tidyverse)
data <- tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)


Each row is one network relationship (an edge between id1 and id2) for a given subject. There are three unique subjects in the data, and the network for the first subject can be represented as this sequence of relations:

a -- b, a -- c, b -- c
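
As a quick check of this reading, the snippet below builds the graph for subject 1 only and lists its edges. It is a minimal sketch that repeats the toy data so it runs on its own; g1 is just an illustrative name, not something from the original post.

```r
library(dplyr)
library(igraph)

# Toy data repeated from above so the snippet is self-contained
data <- tibble::tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)

# Keep only the rows of subject 1 and treat them as an undirected edge list
g1 <- data %>%
  filter(subject == 1) %>%
  select(-subject) %>%
  graph_from_data_frame(directed = FALSE)

vcount(g1)       # number of nodes: 3 (a, b, c)
ecount(g1)       # number of edges: 3
as_edgelist(g1)  # one row per edge, using the node names
```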


The task is to compute node centralities for each subject's network. Using a for loop, this is straightforward:

library(igraph)
# Get unique subjects
subjects_uniq <- unique(data$subject)

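# Compute centrality of nodes for each graph, keeping one result per subject
centrality <- vector("list", length(subjects_uniq))
for (i in seq_along(subjects_uniq)) {
  # Filter by the subject's value, not by the loop index
  current_data <- data %>% filter(subject == subjects_uniq[i]) %>% select(-subject)
  current_graph <- current_data %>% graph_from_data_frame(directed = FALSE)
  centrality[[i]] <- eigen_centrality(current_graph)$vector
}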


Question: My dataset is huge, so I wonder how to avoid an explicit for loop. Should I use apply() and its modern cousins (maybe map() in the purrr package)? Any suggestions are greatly welcome.

Answer

Here is an option using map() from purrr, which returns one centrality vector per subject:

library(tidyverse)
library(igraph)
map(subjects_uniq, ~data %>%
                    filter(subject == .x) %>%
                    select(-subject) %>%
                    graph_from_data_frame(directed = FALSE) %>% 
                    {eigen_centrality(.)$vector})
#[[1]]
#a b c 
#1 1 1 

#[[2]]
#        a         b         c         d 
#1.0000000 0.8546377 0.8546377 0.4608111 

#[[3]]
#a b c 
#1 1 1 
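
A variation on the same idea, assuming the repeated filter() call becomes the bottleneck on a large table: split the edge list once by subject and map over the pieces, so the full table is not rescanned for every subject. This is a sketch rather than a benchmarked solution, and edges_by_subject and centralities are illustrative names.

```r
library(dplyr)
library(purrr)
library(igraph)

# Toy data repeated from the question so the snippet is self-contained
data <- tibble::tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)

# Split the edge list once into one data frame per subject
edges_by_subject <- split(select(data, -subject), data$subject)

# Build a graph from each piece and extract its eigenvector centralities
centralities <- map(edges_by_subject, function(edges) {
  g <- graph_from_data_frame(edges, directed = FALSE)
  eigen_centrality(g)$vector
})

# The list is named by subject, so results can be looked up directly
centralities[["2"]]
```

Because split() names the list elements after the subject values, each result stays attached to its subject even if the subjects are not numbered 1, 2, 3.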