p3rand0r - 1 month ago
R Question

How to format describeBy table in R?

I have this data set:

Defects.I Defects.D Treatment
1 2 A
1 3 B


And I'm trying to compute descriptive statistics for defects detected and isolated, grouped by treatment.
After searching for a while I found a nice function in the psych library called describeBy().
With the following code:

describeBy(myData[1:2],myData$Treatment)


I got this output:

Treatment A
          Mean. Median. Trimmed.
Defects.I x     x       x
Defects.D x     x       x

Treatment B
          Mean. Median. Trimmed.
Defects.I x     x       x
Defects.D x     x       x


But in reality I was looking for something like

           Mean.   Median.   Trimmed.
           A   B   A   B     A   B
Defects.I  x   x   x   x     x   x
Defects.D  x   x   x   x     x   x


Data

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3, Treatment = c("A",
"B")), .Names = c("Defects.I", "Defects.D", "Treatment"), class = "data.frame", row.names = c(NA,
-2L))

Answer

Since describeBy returns a list of data frames, we could just cbind them all, but that doesn't give the right column order. Instead, we can interleave the columns.

myData <- structure(list(Defects.I = c(1L, 1L), Defects.D = 2:3,
                         Treatment = c("A", "B")),
                    .Names = c("Defects.I", "Defects.D", "Treatment"),
                    class = "data.frame", row.names = c(NA, -2L))

l <- psych::describeBy(myData[1:2], myData$Treatment)

So we want the columns interleaved in this order

order(sequence(c(ncol(l$A), ncol(l$B))))
# [1]  1 14  2 15  3 16  4 17  5 18  6 19  7 20  8 21  9 22 10 23 11 24 12 25 13 26

rather than left as cbind alone would leave them, with each data frame's columns 1 to 13 kept together in one block

c(1:13, 1:13)
# [1]  1  2  3  4  5  6  7  8  9 10 11 12 13  1  2  3  4  5  6  7  8  9 10 11 12 13

So this gives the interleaved table

do.call('cbind', l)[, order(sequence(lengths(l)))]
#           A.vars B.vars A.n B.n A.mean B.mean A.sd B.sd A.median B.median A.trimmed B.trimmed A.mad B.mad
# Defects.I      1      1   1   1      1      1   NA   NA        1        1         1         1     0     0
# Defects.D      2      2   1   1      2      3   NA   NA        2        3         2         3     0     0
#           A.min B.min A.max B.max A.range B.range A.skew B.skew A.kurtosis B.kurtosis A.se B.se
# Defects.I     1     1     1     1       0       0     NA     NA         NA         NA   NA   NA
# Defects.D     2     3     2     3       0       0     NA     NA         NA         NA   NA   NA
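To see why order(sequence(...)) interleaves the columns, a smaller case may help. This is only a sketch: toy is a made-up list of two three-column data frames standing in for what describeBy returns, so it runs without psych installed.

```r
# Hypothetical stand-in for the per-group tables (not real describeBy output)
toy <- list(A = data.frame(mean = 1, median = 2, sd = 3),
            B = data.frame(mean = 4, median = 5, sd = 6))

# Number of columns in each data frame: 3 and 3
n_cols <- unname(lengths(toy))

# Position of every column inside its own data frame
within_idx <- sequence(n_cols)
within_idx
# [1] 1 2 3 1 2 3

# order() puts equal positions next to each other; ties keep their
# original order, so A's column comes before B's
idx <- order(within_idx)
idx
# [1] 1 4 2 5 3 6

wide <- do.call('cbind', toy)[, idx]
names(wide)
# [1] "A.mean"   "B.mean"   "A.median" "B.median" "A.sd"     "B.sd"
```

The same index works for any number of groups, as long as every data frame has the same columns in the same order.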

or as a function

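interleave <- function(l, how = c('cbind', 'rbind')) {
  how <- match.arg(how)
  # sequence() numbers the rows (or columns) within each data frame;
  # order() then groups equal positions together, keeping list order for ties
  if (how == 'rbind')
    do.call(how, l)[order(sequence(sapply(l, nrow))), ]
  else
    do.call(how, l)[, order(sequence(sapply(l, ncol)))]
}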

interleave(l)
#           A.vars B.vars A.n B.n
# Defects.I      1      1   1   1
# Defects.D      2      2   1   1 ...
# ...

interleave(l, 'r')
#             vars n mean sd median trimmed mad min max range skew kurtosis se
# A.Defects.I    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
# B.Defects.I    1 1    1 NA      1       1   0   1   1     0   NA       NA NA
# A.Defects.D    2 1    2 NA      2       2   0   2   2     0   NA       NA NA
# B.Defects.D    2 1    3 NA      3       3   0   3   3     0   NA       NA NA
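To get closer to the layout asked for in the question (only mean, median and trimmed, with A and B side by side), one option is to filter the interleaved columns by name. A minimal sketch, assuming the column names follow the "A.mean" pattern shown above; the list l here is a hand-built stand-in for the describeBy result, cut down to a few columns, with values copied from the output above, so the example runs without psych.

```r
# Stand-in for the list returned by psych::describeBy (assumption: same
# row names and column names, fewer columns)
l <- list(
  A = data.frame(n = c(1, 1), mean = c(1, 2), median = c(1, 2),
                 trimmed = c(1, 2),
                 row.names = c("Defects.I", "Defects.D")),
  B = data.frame(n = c(1, 1), mean = c(1, 3), median = c(1, 3),
                 trimmed = c(1, 3),
                 row.names = c("Defects.I", "Defects.D"))
)

# Interleave the columns as in the answer
wide <- do.call('cbind', l)[, order(sequence(unname(lengths(l))))]

# Keep only the statistics of interest; the pattern matches the end of
# names such as "A.mean" or "B.trimmed"
keep <- grep("\\.(mean|median|trimmed)$", names(wide))
res <- wide[, keep]
names(res)
# [1] "A.mean"    "B.mean"    "A.median"  "B.median"  "A.trimmed" "B.trimmed"
```

Swapping the names inside the pattern selects other statistics, for example sd or se.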