Melderon - 2 months ago
R Question

Convert Survival Fraction data to Binomial count data in R?

I have a data set that records, for different genetic lines (line) of fruit flies, the number of individuals in the study (n) and the number that survived (alive). The data are broken up into replicates (rep). The data frame looks like this:

  line rep n alive     trt
1   21   1 5     2 control
2   21   2 5     4 control
3   26   1 5     1 control
4   26   2 5     4 control


In order to fit a binomial model, I want to expand each fraction (alive/n) into one row per fly, with a 0/1 survival outcome (surv). So far I have been doing this manually, which is very painstaking, creating a data frame like this:

   line rep     trt surv
1    21   1 control    0
2    21   1 control    0
3    21   1 control    0
4    21   1 control    1
5    21   1 control    1
6    21   2 control    0
7    21   2 control    1
8    21   2 control    1
9    21   2 control    1
10   21   2 control    1
11   26   1 control    0
12   26   1 control    0
13   26   1 control    0
14   26   1 control    0
15   26   1 control    1
16   26   2 control    0
17   26   2 control    1
18   26   2 control    1
19   26   2 control    1
20   26   2 control    1


This allows me to fit a model in which survival is the response variable, the interaction between line and treatment (trt) is a fixed effect, and rep is a random effect. The model works; the issue is how much time it takes to generate this data frame (I have a total of 139 lines with 5 reps each). Can someone please help me write a function, or point me to an existing function or package, that will do this? Is there an easier way?

Thanks in advance,

Phil

Answer

With your sample data

dd <- read.table(text="    line rep  n   alive    trt
1    21   1   5   2        control
2    21   2   5   4        control
3    26   1   5   1        control
4    26   2   5   4        control", header=TRUE)

You can use dplyr and tidyr to help...

library(dplyr)
library(tidyr)

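dd %>% mutate(dead=n-alive) %>% select(-n) %>%                    # count the dead, drop the total
    gather(status, count, c(alive,dead)) %>%                      # one row per line/rep/status
    slice(rep(1:n(), .$count)) %>% select(-count) %>%             # repeat each row `count` times
    transform(surv=ifelse(status=="alive",1,0), status=NULL) %>%  # recode status as 0/1
    arrange(line, rep, trt, surv)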

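We use gather() to create a separate row for each outcome within every line/rep (alive and dead, which become surv=1 and surv=0), and slice() to repeat each of those rows count times, so the result has one row per fly.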
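
If you would rather avoid the extra packages, the same row replication can be sketched in base R. This is only an illustration under the assumptions of the sample data above (columns named line, rep, n, alive and trt); expand_survival() is a hypothetical helper name, not a function from any package.

```r
# Sample data from the question
dd <- read.table(text = "line rep n alive trt
21 1 5 2 control
21 2 5 4 control
26 1 5 1 control
26 2 5 4 control", header = TRUE)

# expand_survival() is a hypothetical helper, written for this example only.
# It assumes columns line, rep, n, alive and trt, as in the sample data.
expand_survival <- function(d) {
  # Repeat each row index n times, so every fly gets its own row
  idx <- rep(seq_len(nrow(d)), times = d$n)
  out <- d[idx, c("line", "rep", "trt")]
  # Within each original row: (n - alive) zeros followed by `alive` ones
  out$surv <- unlist(Map(function(n, a) rep(c(0, 1), times = c(n - a, a)),
                         d$n, d$alive))
  rownames(out) <- NULL
  out
}

long <- expand_survival(dd)
head(long)
```

The index vector built with rep() plays the same role as the slice() call above. If you stay with tidyr, its uncount() function also replicates rows according to a count column and may be worth a look.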