Zach Eisner - 2 months ago
R Question

List reformatting in R

I have this df:

       KEGGnumber           Cor Colors
X1         C00095 -2.623973e-01    RED
X2 C17714, C00044 -2.241113e-01    RED
X3         C00033 -3.066684e-01    RED


and would like to format it as a two-column data frame, with each individual KEGGnumber matched with its Colors value. It would look something like this:

KEGGnumber Colors
C00095 RED
C17714 RED
C00044 RED
C00033 RED


Essentially, the new data frame takes the rows of the old data frame that contain more than one KEGGnumber and splits them into one row per KEGGnumber, keeping the same Colors value for each.

Answer

tidyr makes this quite easy:

library(tidyr)

df %>% separate_rows(KEGGnumber)
##          Cor Colors KEGGnumber
## 1 -0.2623973    RED     C00095
## 2 -0.2241113    RED     C17714
## 3 -0.2241113    RED     C00044
## 4 -0.3066684    RED     C00033

Chop off the Cor column if you like.
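
Dropping the column could look something like the sketch below. It rebuilds the example data from the question so it runs on its own (the data.frame() call is my reconstruction, not the asker's code), passes sep = ", " explicitly on the assumption that the IDs are always separated by a comma and a space, and uses plain column subsetting to keep only the two columns asked for:

```r
library(tidyr)

# Rebuild the example data from the question (values copied from the post;
# the construction itself is an assumption about how df was made)
df <- data.frame(
  KEGGnumber = c("C00095", "C17714, C00044", "C00033"),
  Cor = c(-2.623973e-01, -2.241113e-01, -3.066684e-01),
  Colors = "RED",
  row.names = c("X1", "X2", "X3"),
  stringsAsFactors = FALSE
)

# One row per KEGG ID, splitting on the comma-plus-space separator
res <- separate_rows(df, KEGGnumber, sep = ", ")

# Keep only the two columns the question asks for
res <- res[, c("KEGGnumber", "Colors")]
res
```

Selecting columns by name rather than by position means it does not matter where separate_rows places KEGGnumber in the result.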

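A less-pretty base option:

# Split each KEGGnumber string on ', ', pair every piece with that row's Colors,
# then stack the resulting per-row data frames with rbind.
do.call(rbind, 
        Map(function(x, y){data.frame(KEGGnumber = x, Colors = y)}, 
            strsplit(as.character(df$KEGGnumber), ', '), 
            df$Colors))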
##   KEGGnumber Colors
## 1     C00095    RED
## 2     C17714    RED
## 3     C00044    RED
## 4     C00033    RED
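
A variation on the base approach, in case it is useful: instead of building one small data frame per row and binding them, it may be simpler to split once and repeat each Colors value as many times as its row has IDs. This is a sketch under the same assumptions as above (IDs separated by ', ', and df rebuilt here from the question's values so the snippet runs on its own); it swaps Map/rbind for rep() and lengths():

```r
# Rebuild the example data from the question (an assumed reconstruction)
df <- data.frame(
  KEGGnumber = c("C00095", "C17714, C00044", "C00033"),
  Cor = c(-2.623973e-01, -2.241113e-01, -3.066684e-01),
  Colors = "RED",
  row.names = c("X1", "X2", "X3"),
  stringsAsFactors = FALSE
)

# List with one character vector of IDs per original row
ids <- strsplit(as.character(df$KEGGnumber), ", ", fixed = TRUE)

# Flatten the IDs and repeat each row's colour once per ID in that row
out <- data.frame(
  KEGGnumber = unlist(ids),
  Colors = rep(df$Colors, lengths(ids)),
  stringsAsFactors = FALSE
)
out
```

Because it creates a single data frame rather than one per input row, this version should also scale better to long inputs, though for a handful of rows the difference is negligible.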