LanieD - 4 months ago 36

R Question

I have an R data frame with a single column and many rows. The column holds the responses of several individuals, one value per row. I would like to reshape the data so that each individual gets one row. However, there is no ID variable; the only pattern is that the last value for each individual is numeric. Hence whatever follows a number should start a new row.

Existing data format:

    alpha
    bravo
    charlie
    5
    alpha
    charlie
    2
    delta
    1

    dd <- data.frame(xx = c("alpha","bravo","charlie",5,"alpha","charlie",2,"delta",1))

I would like this data to be rearranged into one of the following forms, in order of most desirable to least desirable:

    alpha bravo charlie 5 # Best
    alpha charlie 2
    delta 1

or

    alpha bravo charlie 5
    alpha charlie 2
    delta 1

or

    alpha bravo charlie 5 # Worst but acceptable if above is not possible.
    alpha charlie 2
    delta 1

Answer

The ideal layout cannot be produced without more information, because it is not clear which column each value should be allocated to. The second-best layout can be produced as follows.

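```
x <- c('alpha', 'bravo', 'charlie', '5',
       'alpha', 'charlie', '2',
       'delta', '1')
# With the question's data frame, use: x <- as.character(dd$xx)

rowend <- grep("^[0-9]+$", x)              # a numeric score closes each individual's row
n <- length(rowend)                        # number of individuals
rowbegin <- c(1, head(rowend, n - 1) + 1)  # each row starts right after the previous score
m <- max(rowend - rowbegin) + 1            # number of columns (length of the longest row)

# Left-align the responses, pad with "" and keep the score in the last column.
# seq_len() also handles an individual who has a score but no responses.
y <- Map(function(i, j) c(x[seq_len(j - i) + i - 1], rep("", m - (j - i + 1)), x[j]),
         rowbegin, rowend)
as.data.frame(matrix(unlist(y), nrow = n, ncol = m, byrow = TRUE))
```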
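
If one is willing to supply the missing allocation rule, the most desirable layout becomes possible too. The following is only a sketch under an assumption that the question does not confirm: every distinct response label (`alpha`, `bravo`, ...) gets its own column, and an individual who did not give that response gets `NA` there. The names `is_score`, `id`, `labels`, `groups`, `wide` and `out` are made up for this illustration, and the data are assumed to end with a score.

```r
x <- c("alpha", "bravo", "charlie", "5",
       "alpha", "charlie", "2",
       "delta", "1")

is_score <- grepl("^[0-9]+$", x)
# Individual ID = 1 + number of scores seen strictly before each element
id <- cumsum(c(FALSE, head(is_score, -1))) + 1
n <- sum(is_score)                         # number of individuals

# ASSUMPTION: each distinct label is its own column, in order of first appearance
labels <- unique(x[!is_score])

# Responses per individual; the factor levels keep individuals with no responses
groups <- split(x[!is_score], factor(id[!is_score], levels = seq_len(n)))

# One row per individual: the label where it was given, NA otherwise
wide <- do.call(rbind, lapply(groups, function(r) {
  ifelse(labels %in% r, labels, NA_character_)
}))
colnames(wide) <- labels

out <- data.frame(wide, score = as.numeric(x[is_score]),
                  stringsAsFactors = FALSE)
out
```

With the sample data this gives three rows and the columns `alpha`, `bravo`, `charlie`, `delta` and `score`. If the real data follow a different allocation rule (for example, a fixed response position per question), `labels` would have to be defined differently.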