sweetmusicality - 6 months ago
R Question

Split one variable into multiple variables in R

I am relatively new to R. My question isn't entirely as straightforward as the title. This is a sample of what df looks like:

id amenities
1 wireless internet, air conditioning, pool, kitchen
2 pool, kitchen, washer, dryer
3 wireless internet, kitchen, dryer
4
5 wireless internet


This is what I want df to look like:

id  wireless internet  air conditioning  pool  kitchen  washer  dryer
1   1                  1                 1     1        0       0
2   0                  0                 1     1        1       1
3   1                  0                 0     1        0       1
4   0                  0                 0     0        0       0
5   1                  0                 0     0        0       0


Sample code to reproduce the data:

df <- data.frame(id = c(1, 2, 3, 4, 5),
                 amenities = c("wireless internet, air conditioning, pool, kitchen",
                               "pool, kitchen, washer, dryer",
                               "wireless internet, kitchen, dryer",
                               "",
                               "wireless internet"),
                 stringsAsFactors = FALSE)

www
Answer Source

A solution using dplyr and tidyr. Note that I replace "" with "None" so that the placeholder column for the empty row is easy to drop by name afterwards.

library(dplyr)
library(tidyr)

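df2 <- df %>%
  # Split on ", " (comma plus space); splitting on "," alone leaves a leading
  # space, so " pool" and "pool" would end up as two separate columns
  separate_rows(amenities, sep = ", ") %>%
  # Label the empty amenities string so its column can be dropped by name
  mutate(amenities = ifelse(amenities %in% "", "None", amenities)) %>%
  mutate(value = 1) %>%
  spread(amenities, value, fill = 0) %>%
  select(-None)
df2
#   id air conditioning dryer kitchen pool washer wireless internet
# 1  1                1     0       1    1      0                 1
# 2  2                0     1       1    1      1                 0
# 3  3                0     1       1    0      0                 1
# 4  4                0     0       0    0      0                 0
# 5  5                0     0       0    0      0                 1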
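
If you would rather not load any packages, the same idea (split each string, then mark which amenities each id has) can probably be sketched in base R as below. This is only a minimal sketch, not part of the dplyr/tidyr solution: the names items, all_amenities, indicators and df_wide are my own, and I am assuming the amenities are always separated by a comma followed by optional whitespace.

```r
# Sample data from the question
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 amenities = c("wireless internet, air conditioning, pool, kitchen",
                               "pool, kitchen, washer, dryer",
                               "wireless internet, kitchen, dryer",
                               "",
                               "wireless internet"),
                 stringsAsFactors = FALSE)

# Split each string on a comma followed by optional whitespace;
# the empty string yields character(0), i.e. no amenities for that id
items <- strsplit(df$amenities, ",\\s*")

# Distinct amenity names, in order of first appearance
all_amenities <- unique(unlist(items))

# One row per id, one 0/1 column per amenity
indicators <- do.call(rbind, lapply(items, function(x) as.integer(all_amenities %in% x)))
colnames(indicators) <- all_amenities

# check.names = FALSE keeps the spaces in the column names
df_wide <- data.frame(id = df$id, indicators, check.names = FALSE)
df_wide
```

Here the columns come out in order of first appearance rather than alphabetically, and no "None" placeholder is needed because an id with no amenities simply matches nothing.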