Rohit Kadam Rohit Kadam - 4 months ago 21
R Question

How to convert rows that contain same value for one column but different for other columns into one single row using R?

I have data as follows :

User_Id Website Day
A Google Monday
A Facebook Tuesday
A Linkedin Wednesday
B Facebook Tuesday
B Linkedin Wednesday


I want to achieve something like this:

User_ID Google Facebook Linkedin Monday Tuesday Wednesday
A 1 1 1 1 1 1
B 0 1 1 0 1 1


The columns now represent the number of times it is appearing for each users.
How can I do this in R ?

Answer

We unlist the 2nd and 3rd column of data.frame (unlist(df1[-1]) and replicate the 1st column by the number of other columns i.e. in this case 2 (rep(df1[,1], 2)), get the frequency count with table and convert to a data.frame (as.data.frame.matrix).

as.data.frame.matrix(table(rep(df1[,1],2), unlist(df1[-1])))
#  Facebook Google Linkedin Monday Tuesday Wednesday
#A        1      1        1      1       1         1
#B        1      0        1      0       1         1

If we need a package solution, another option is dplyr/tidyr. Reshape the 'wide' to 'long' format with gather (from tidyr), get the frequency count and spread back to 'wide' format.

library(dplyr)
library(tidyr)
df1 %>%
    gather(Var, Val, -User_Id) %>%
    count(User_Id, Val) %>% 
    spread(Val, n, fill = 0)   
#   User_Id Facebook Google Linkedin Monday Tuesday Wednesday
#    <chr>    <dbl>  <dbl>    <dbl>  <dbl>   <dbl>     <dbl>
#1       A        1      1        1      1       1         1
#2       B        1      0        1      0       1         1