Andrej - 3 years ago
R Question

How to avoid for loop when iterating through unique values in a column [R]

Let's assume that we have the following toy data:

library(tidyverse)
library(igraph)

data <- tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)

which represent network relationships for each subject. For example, there are three unique subjects in the data, and the network for the first subject can be represented as a sequence of relations:

a -- b, a -- c, b -- c
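
As a minimal sketch of this representation, assuming the tibble and igraph packages are installed, the rows for the first subject can be turned into an undirected graph and inspected. The names edges_s1 and g1 are illustrative only and do not come from the original post.

```r
library(tibble)
library(igraph)

# Edge list for subject 1 only (rows copied from the toy data)
edges_s1 <- tibble(
  id1 = c("a", "a", "b"),
  id2 = c("b", "c", "c")
)

# The first two columns are read as the two endpoints of each edge
g1 <- graph_from_data_frame(edges_s1, directed = FALSE)

V(g1)$name       # vertex names found in the edge list
as_edgelist(g1)  # one row per edge
```

Each row of the table becomes one edge, so a subject with three rows yields a graph with three edges.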

The task is to compute centralities for each network. Using a for loop, this is straightforward:

# Get unique subjects
subjects_uniq <- unique(data$subject)

# Compute centrality of nodes for each graph, one list element per subject
centrality <- vector("list", length(subjects_uniq))
for (i in seq_along(subjects_uniq)) {
  # Filter on the subject value itself, not on the loop index
  current_data <- data %>% filter(subject == subjects_uniq[i]) %>% select(-subject)
  current_graph <- current_data %>% graph_from_data_frame(directed = FALSE)
  centrality[[i]] <- eigen_centrality(current_graph)$vector
}

Question: My dataset is huge, so I wonder how to avoid the explicit loop. Should I use apply and its modern cousins (maybe map in the purrr package)? Any suggestions are greatly welcome.

Answer

Here is an option using map:

map(subjects_uniq, ~data %>%
                    filter(subject == .x) %>%
                    select(-subject) %>%
                    graph_from_data_frame(directed = FALSE) %>%
                    {eigen_centrality(.)$vector})
#[[1]]
#a b c 
#1 1 1 

#[[2]]
#        a         b         c         d 
#1.0000000 0.8546377 0.8546377 0.4608111 

#[[3]]
#a b c 
#1 1 1 
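
Note that map mainly tidies the code and is not necessarily faster than a for loop. Filtering the full table once per subject also rescans every row each time. A possible variant, sketched here on the assumption that the tibble, dplyr, purrr and igraph packages are available, splits the data by subject once and then maps over the pieces. The helper name centrality_of and the names by_subject and result are hypothetical, not part of the original answer.

```r
library(tibble)
library(dplyr)
library(purrr)
library(igraph)

data <- tibble(
  subject = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  id1 = c("a", "a", "b", "a", "a", "a", "b", "a", "a", "b"),
  id2 = c("b", "c", "c", "b", "c", "d", "c", "b", "c", "c")
)

# Hypothetical helper: eigenvector centrality for one subject's edge list
centrality_of <- function(edges) {
  g <- graph_from_data_frame(select(edges, id1, id2), directed = FALSE)
  eigen_centrality(g)$vector
}

# Split the rows by subject once, then apply the helper to each piece
by_subject <- split(data, data$subject)
result <- map(by_subject, centrality_of)

result[["2"]]  # named numeric vector for subject 2
```

Because split() names the pieces by subject, the result is a named list, so each centrality vector can be looked up by subject rather than by position.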