B C B C - 23 days ago 10
R Question

Accessing from a split column

Here is how I split a column:

df <- data.frame('foo' = rep(c('ab','ac'), each = 5))
df <- within(df, boo <- data.frame(do.call('rbind', strsplit(as.character(df$foo),'',fixed=FALSE))))


Output:

foo boo.X1 boo.X2
1 ab a b
2 ab a b
3 ab a b
4 ab a b
5 ab a b
6 ac a c
7 ac a c
8 ac a c
9 ac a c
10 ac a c


However, when I try to access 'boo.X1' or 'boo.X2', I cannot. For example, names() lists only two columns, 'foo' and 'boo':

names(df)
# [1] "foo" "boo"


Maybe I am missing something obvious. Any help would be appreciated, thank you.

Answer

If you look at the structure of df with str(), you can see that boo is itself a data.frame nested inside df:

> str(df)
'data.frame':   10 obs. of  2 variables:
 $ foo: Factor w/ 2 levels "ab","ac": 1 1 1 1 1 2 2 2 2 2
 $ boo:'data.frame':    10 obs. of  2 variables:
  ..$ X1: Factor w/ 1 level "a": 1 1 1 1 1 1 1 1 1 1
  ..$ X2: Factor w/ 2 levels "b","c": 1 1 1 1 1 2 2 2 2 2

So you can access the two columns with df$boo$X1 and df$boo$X2.
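As a minimal sketch of that nested access, the snippet below rebuilds the data frame from the question and pulls out both inner columns. The variable names first and second are only illustrative, and the class of the inner columns (factor or character) depends on your R version's stringsAsFactors default.

```r
# Rebuild the nested data frame exactly as in the question.
df <- data.frame('foo' = rep(c('ab', 'ac'), each = 5))
df <- within(df, boo <- data.frame(do.call('rbind', strsplit(as.character(df$foo), '', fixed = FALSE))))

# The inner column names live one level down, inside boo.
inner_names <- names(df$boo)

# Two equivalent ways to reach a nested column.
first  <- df$boo$X1             # chained $ access
second <- df[['boo']][['X2']]   # chained [[ ]] access
```

The [[ ]] form is handy when the column names are held in variables rather than typed literally.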

If you want the split pieces as ordinary top-level columns instead, you can append them with cbind as follows:

df <- data.frame('foo' = rep(c('ab','ac'), each = 5))
df <- cbind(df, do.call('rbind', strsplit(as.character(df$foo),'',fixed=FALSE)))
names(df) <- c("foo", "boo_1", "boo_2")

which gives you

   foo boo_1 boo_2
1   ab     a     b
2   ab     a     b
3   ab     a     b
4   ab     a     b
5   ab     a     b
6   ac     a     c
7   ac     a     c
8   ac     a     c
9   ac     a     c
10  ac     a     c
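If the nested data frame already exists and you would rather not rebuild it, one possible way to flatten it is to bind the top-level column and the inner data frame side by side. This is a sketch under the assumption that df has the nested structure from the question; flat is an illustrative name, not something from the original post.

```r
# Nested data frame as produced in the question.
df <- data.frame('foo' = rep(c('ab', 'ac'), each = 5))
df <- within(df, boo <- data.frame(do.call('rbind', strsplit(as.character(df$foo), '', fixed = FALSE))))

# Bind the outer column and the inner data frame's columns together,
# then give the result the same names used above.
flat <- cbind(df['foo'], df$boo)
names(flat) <- c("foo", "boo_1", "boo_2")
```

After this, flat$boo_1 and flat$boo_2 are regular columns and show up in names(flat).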