ROY ROY - 2 months ago 6
R Question

Segregating dataset and name each new dataset as per unique column names

I have a dataset (nm) as shown below:

nm

2_V2O 10_Kutti 14_DD 15_TT 16_DD 19_V2O 20_Kutti
0 1 1 0 0 1 0
1 1 1 1 1 0 0
0 1 0 1 0 0 1
0 1 1 0 1 0 0


Now I want to split it into multiple new datasets, one per unique column-name suffix (the part after the underscore). Each new dataset should be named after that suffix, as shown below:

Kutti
10_Kutti 20_Kutti
1 0
1 0
1 1
1 0

V2O
2_V2O 19_V2O
0 1
1 0
0 0
0 0

DD
14_DD 16_DD
1 0
1 1
0 0
1 1

TT
15_TT
0
1
1
0


I know this can be done using the "select" function in dplyr, but I need a dynamic program that builds these datasets automatically for any dataset.

Answer

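We can split the columns of 'nm' by a substring of their names. Use sub to remove the prefix of each column name up to and including the last _, then pass the result as the grouping variable to split.default, which treats the data.frame as a list of columns and so splits by column rather than by row.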

lst <- split.default(nm, sub(".*_", "", names(nm)))
lst
#$DD
#  14_DD 16_DD
#1     1     0
#2     1     1
#3     0     0
#4     1     1

#$Kutti
#  10_Kutti 20_Kutti
#1        1        0
#2        1        0
#3        1        1
#4        1        0

#$TT
#  15_TT
#1     0
#2     1
#3     1
#4     0

#$V2O
#  2_V2O 19_V2O
#1     0      1
#2     1      0
#3     0      0
#4     0      0
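
To see what the grouping key looks like before splitting, here is a minimal sketch. It rebuilds the example data inline so that it runs on its own; the variable name keys is introduced here purely for illustration and is not part of the original answer.

```r
# Rebuild the example data; check.names = FALSE keeps names such as "2_V2O" intact
nm <- data.frame(
  `2_V2O`    = c(0L, 1L, 0L, 0L),
  `10_Kutti` = c(1L, 1L, 1L, 1L),
  `14_DD`    = c(1L, 1L, 0L, 1L),
  `15_TT`    = c(0L, 1L, 1L, 0L),
  `16_DD`    = c(0L, 1L, 0L, 1L),
  `19_V2O`   = c(1L, 0L, 0L, 0L),
  `20_Kutti` = c(0L, 0L, 1L, 0L),
  check.names = FALSE
)

# ".*_" is greedy, so everything up to the LAST underscore is removed
keys <- sub(".*_", "", names(nm))
keys
# "V2O" "Kutti" "DD" "TT" "DD" "V2O" "Kutti"

# split.default groups the columns of nm by the key of each column
lst <- split.default(nm, keys)
names(lst)
# "DD" "Kutti" "TT" "V2O"
```

Because the pattern is greedy, a column name containing more than one underscore would be grouped by whatever follows the last one; if the suffix itself may contain underscores, a different pattern would be needed.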

It is better to keep the data.frames in a list. If individual data.frame objects in the global environment are really required (not recommended), use list2env:

list2env(lst, envir = .GlobalEnv)

Each data.frame can now be called by its name, for example:

DD
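
As a rough illustration of why keeping the list is usually more convenient, the sketch below accesses one group by name and applies the same operation to every group. It rebuilds the data so that it is self-contained; the names row_totals, n_cols and env are hypothetical helpers chosen for this example, and putting the objects in a fresh environment instead of .GlobalEnv is a suggested variation, not something the original answer does.

```r
# Rebuild the example data and the list of per-suffix data.frames
nm <- data.frame(
  `2_V2O`    = c(0L, 1L, 0L, 0L),
  `10_Kutti` = c(1L, 1L, 1L, 1L),
  `14_DD`    = c(1L, 1L, 0L, 1L),
  `15_TT`    = c(0L, 1L, 1L, 0L),
  `16_DD`    = c(0L, 1L, 0L, 1L),
  `19_V2O`   = c(1L, 0L, 0L, 0L),
  `20_Kutti` = c(0L, 0L, 1L, 0L),
  check.names = FALSE
)
lst <- split.default(nm, sub(".*_", "", names(nm)))

# Access a single group by name without creating any global objects
lst[["DD"]]

# Apply the same operation to every group, e.g. row sums within each group
row_totals <- lapply(lst, rowSums)

# Number of columns that ended up in each group
n_cols <- vapply(lst, ncol, integer(1))

# If separate objects are wanted, a dedicated environment avoids
# overwriting anything that already exists in the global environment
env <- list2env(lst, envir = new.env())
get("DD", envir = env)
```

With the list, adding or removing columns in nm changes only the contents of lst, so any code that loops over it keeps working without edits.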

Data

nm <- structure(list(`2_V2O` = c(0L, 1L, 0L, 0L), `10_Kutti` = c(1L, 
1L, 1L, 1L), `14_DD` = c(1L, 1L, 0L, 1L), `15_TT` = c(0L, 1L, 
1L, 0L), `16_DD` = c(0L, 1L, 0L, 1L), `19_V2O` = c(1L, 0L, 0L, 
0L), `20_Kutti` = c(0L, 0L, 1L, 0L)), .Names = c("2_V2O", "10_Kutti", 
"14_DD", "15_TT", "16_DD", "19_V2O", "20_Kutti"), class = "data.frame",
row.names = c(NA, -4L))