dinesh khemani dinesh khemani - 1 year ago 47
R Question

I want to convert one of my column values into multiple variables in R

I have the data in below format

ID ID_2 Item ID
001 111 1111
001 111 1112
001 112 1113
001 112 1114
001 112 1115
001 112 1116
001 113 1117
008 222 1118
008 222 1119
008 223 1120
011 333 1121
012 444 1122
012 444 1123
012 444 1124
012 444 1125
012 444 1126
017 555 1127
017 555 1128
017 555 1129


but i want to convert item_ID into multiple variables based on the order they appear in ID_2 and ID_1.Somewhat like what i have mentioned below

ID ID_2 Item_1 Item_2 Item_3 Item_4 Item_5
001 111 1111 1112
001 112 1113 1114 1115 1116
001 113 1117
008 222 1118 1119
008 223 1120
011 333 1121
012 444 1122 1123 1124 1125 1126
017 555 1127 1128 1129


I have tried Concatenating ID and ID_2, then create a key number 1,2.. so that the key changes as soon as the Concatenated value changes. But I have 38L rows of data it keeps on going.

I would appreciate any help from a person who has dealt with similar problem before.Thanks

PKr PKr
Answer Source

A little long way using str_split_fixed and str_count from stringr library can be following:

library("stringr")

df1<- aggregate( Item_ID ~ ID_2 + ID ,data=df,paste0,collapse =",")
maxl <- max(str_count(df1$Item_ID,","))+1
splitcols <- str_split_fixed(df1$Item_ID,pattern=",",n=maxl)
setNames(data.frame(df1[,2:1],splitcols),c("ID","ID_2",paste0("Item_",1:maxl)))

Output:

#    ID ID_2 Item_1 Item_2 Item_3 Item_4 Item_5
# 1  1  111   1111   1112                     
# 2  1  112   1113   1114   1115   1116       
# 3  1  113   1117                            
# 4  8  222   1118   1119                     
# 5  8  223   1120                            
# 6 11  333   1121                            
# 7 12  444   1122   1123   1124   1125   1126
# 8 17  555   1127   1128   1129              
# > 
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download