Z.Chanell - 2 months ago
R Question

Questionnaires filled in on the same day

I am working with a data set of multiple questionnaires that were supposed to be filled in at different time points, e.g.:

173 9/13/2013 10/29/2013 9/26/2014
174 10/21/2013 11/25/2013 11/3/2014
175 7/1/2014 7/3/2015 4/27/2016
176 1/15/2014 2/24/2014 6/10/2015
177 3/15/2014 4/1/2015
178 7/18/2014 9/18/2014 8/17/2015
179 6/30/2013 8/15/2013 7/15/2014
180 4/22/2013 6/24/2013 5/11/2014
181 12/7/2014 12/26/2015
182 4/2/2015 5/17/2015 4/20/2016
183 1/12/2015 2/26/2015 1/28/2016
184 7/18/2014 8/26/2014 8/14/2015
185 8/27/2013 10/19/2013 9/21/2014
186 10/29/2013 11/30/2013 11/6/2014
187 9/17/2014 11/18/2014 10/20/2015
188 5/10/2014 6/27/2014 6/1/2015
189 10/4/2013 10/5/2014
190 1/22/2013 4/11/2013
191 10/21/2014 10/21/2014


I would like to know how many participants filled in all questionnaires on the same day, how many filled in at least 2 questionnaires on the same day, how many filled in at least 3 on the same day, etc.
Any help would be highly appreciated.

Reproducible data:

Label = c(
"1/25/2015", "1/25/2016", "1/26/2014", "1/26/2015", "1/27/2014",
"1/27/2015", "1/28/2014", "1/28/2015", "1/29/2015", "1/3/2014",
"1/3/2015", "1/3/2016", "1/30/2015", "1/31/2014", "1/4/2014",
"1/4/2015", "1/4/2016", "1/5/2014", "1/5/2015", "1/6/2014",
"1/6/2015", "1/7/2014", "1/7/2015", "1/8/2014", "1/8/2015",
"1/9/2014", "1/9/2015", "1/9/2016", "10/1/2012", "10/1/2013",
"10/1/2014", "10/1/2015", "10/10/2013", "10/10/2014", "10/11/2013",
"10/11/2014", "10/11/2015", "10/12/2013", "10/12/2014", "10/12/2015",
"10/13/2013", "10/13/2014", "10/13/2015", "10/14/2013", "10/14/2014",
"10/14/2015", "10/15/2014", "10/15/2015", "10/16/2013", "10/16/2014",
"10/16/2015", "10/17/2013", "10/17/2014", "10/17/2015", "10/18/2013",
"10/18/2014", "10/18/2015", "10/19/2013", "10/19/2014", "10/19/2015",
"10/2/2013", "10/2/2014", "10/20/2013", "10/20/2014", "10/20/2015",
"10/21/2013", "10/21/2014", "10/22/2013", "10/22/2014", "10/22/2015",
"10/23/2012", "10/23/2013", "10/23/2014", "10/23/2015", "10/24/2013",
"10/24/2014", "10/24/2015", "10/25/2013", "10/25/2014", "10/26/2013",
"10/26/2014", "10/26/2015", "10/27/2013", "10/27/2014", "10/27/2015",
"10/28/2013", "10/28/2014", "10/29/2013", "10/29/2014", "10/3/2014",
"10/3/2015", "10/30/2014", "10/31/2012", "10/31/2013", "10/31/2014",
"10/31/2015", "10/4/2013", "10/4/2014", "10/4/2015", "10/5/2014",
"10/5/2015", "10/6/2013", "10/6/2014", "10/6/2015", "10/7/2013",
"10/7/2014", "10/8/2012", "10/8/2014", "10/8/2015", "10/9/2013",
"10/9/2014", "10/9/2015", "11/1/2013", "11/1/2014", "11/1/2015",
class = "factor")

Label = c(
"4/6/2015", "4/7/2015", "4/9/2012", "5/12/2015", "5/13/2014",
"5/14/2015", "5/15/2014", "5/15/2015", "5/17/2014", "5/19/2014",
"5/20/2014", "5/25/2014", "5/27/2014", "5/29/2014", "5/30/2014",
"5/30/2015", "5/31/2015", "5/4/2014", "5/9/2015", "6/1/2015",
"6/10/2014", "6/11/2014", "6/11/2015", "6/12/2015", "6/16/2014",
"6/16/2015", "6/18/2014", "6/21/2014", "6/24/2015", "6/25/2014",
"6/25/2015", "6/26/2015", "6/27/2015", "6/29/2015", "6/5/2014",
"6/6/2015", "6/8/2014", "7/1/2014", "7/13/2014", "7/14/2015",
"7/16/2014", "7/2/2014", "7/21/2014", "7/25/2014", "7/27/2014",
"7/27/2015", "7/28/2014", "7/29/2014", "7/30/2014", "7/31/2014",
"7/31/2015", "7/4/2014", "7/4/2015", "8/1/2014", "8/11/2014",
"8/11/2015", "8/25/2014", "8/27/2015", "8/5/2014", "8/8/2014",
"8/9/2015", "9/1/2014", "9/10/2015", "9/15/2015", "9/22/2013",
"9/3/2012", "9/30/2014", "9/8/2014", "9/8/2015"), class = "factor")

Label = c(" ",
"1/16/2016", "1/26/2015", "10/11/2015", "10/14/2015", "10/16/2015",
"10/6/2014", "10/7/2013", "11/11/2015", "11/15/2015", "11/17/2013",
"11/18/2013", "11/2/2015", "11/20/2013", "11/29/2013", "2/17/2014",
"2/17/2015", "2/21/2015", "2/23/2014", "2/25/2014", "2/25/2015",
"3/11/2016", "3/2/2014", "3/22/2015", "3/4/2014", "3/4/2016",
"4/11/2014", "4/12/2013", "4/18/2016", "4/21/2015", "4/23/2015",
"4/29/2015", "4/3/2015", "4/5/2016", "5/23/2015", "5/26/2015",
"5/27/2015", "5/28/2015", "5/29/2014", "5/29/2015", "5/8/2015",
"6/16/2015", "6/22/2015", "6/28/2015", "7/24/2015", "7/27/2015",
"7/4/2014", "7/8/2015", "9/14/2015", "9/15/2015", "9/16/2014",
"9/17/2014", "9/22/2014", "9/23/2014", "9/24/2014", "9/24/2015",
"9/26/2014", "9/28/2015", "9/30/2015", "9/9/2015"), class = "factor")), .Names = c("1A_RespDate",
"1B_RespDate", "1C_1_RespDate", "1C_2_RespDate",
"1C_RespDate", "2A_1_RespDate", "2A_RespDate", "2B_RespDate",
"2C_RespDate"), row.names = c(NA, -4831L), class = "data.frame")

Answer

I'll call your data frame df:

apply(df, 1, function(x) length(unique(x)))

gives the number of unique dates for each individual as a vector. The highest value is 7 and the lowest is 1 (all questionnaires answered on the same day).
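One caveat is that empty or missing cells count as a "date" when counting unique values per row, which can inflate the result for participants who skipped a questionnaire. The following is a minimal sketch on made-up data that drops such cells first; `df_toy` and `n_unique_dates` are hypothetical names, not part of the original data.

```r
# Toy data: three questionnaires (columns) for four hypothetical participants.
# The last participant has one questionnaire missing (empty string).
df_toy <- data.frame(
  q1 = c("9/13/2013", "10/21/2014", "7/1/2014", "3/15/2014"),
  q2 = c("10/29/2013", "10/21/2014", "7/1/2014", "4/1/2015"),
  q3 = c("9/26/2014", "10/21/2014", "4/27/2016", ""),
  stringsAsFactors = FALSE
)

# Count the distinct dates in one row, ignoring NA and blank cells.
n_unique_dates <- function(x) {
  x <- trimws(as.character(x))
  x <- x[!is.na(x) & x != ""]
  length(unique(x))
}

# One count per participant (row).
n_unique <- apply(df_toy, 1, n_unique_dates)
print(n_unique)
```

On the real data, the same function could be passed to `apply(df, 1, ...)`, assuming blank cells there are empty strings, spaces, or NA.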

which(apply(df, 1, function(x) length(unique(x))) < 7)

gives the indices of the individuals who filled in at least 2 questionnaires on the same day.

length(which(apply(df, 1, function(x) length(unique(x))) < 7))

tells you how many individuals filled in at least 2 questionnaires on the same day.
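The comparison against 7 presumably relies on every participant having the same number of entries. As a hedged alternative, the check can be made per row by asking whether any non-missing date is repeated. This sketch uses made-up data; `df_toy` and `has_shared_day` are hypothetical names.

```r
# Toy data: three questionnaires for four hypothetical participants.
df_toy <- data.frame(
  q1 = c("9/13/2013", "10/21/2014", "7/1/2014", "3/15/2014"),
  q2 = c("10/29/2013", "10/21/2014", "7/1/2014", "4/1/2015"),
  q3 = c("9/26/2014", "10/21/2014", "4/27/2016", ""),
  stringsAsFactors = FALSE
)

# TRUE if at least two non-missing dates in the row are identical.
has_shared_day <- apply(df_toy, 1, function(x) {
  x <- trimws(as.character(x))
  x <- x[!is.na(x) & x != ""]
  anyDuplicated(x) > 0
})

shared_idx <- which(has_shared_day)  # row indices of those participants
n_shared   <- sum(has_shared_day)    # how many participants there are
print(shared_idx)
print(n_shared)
```

This avoids hard-coding the number of questionnaires, at the cost of treating blank cells as "not filled in", which is an assumption about the data.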

Edit: This is inelegant (there must be a cleaner way) but it seems to work

which(sapply(sapply(sapply(apply(df,1,table),function(x) x==Z),which),function(x) any(x>0)))

Set Z to the number of questionnaires filled in on the same day.
Explanation:

apply(df,1,table)

gives a list containing, for each individual, the unique dates and how many times each one appears.

sapply(apply(df,1,table),function(x) x==Z)

gives the same list with TRUE/FALSE indicating whether each date appears exactly Z times.

sapply(sapply(apply(df,1,table),function(x) x==Z),which)

gives, for each individual, either "integer(0)" or positive integers that are the positions of the matching dates (not something we are interested in).

sapply(sapply(sapply(apply(df,1,table),function(x) x==Z),which),function(x) any(x>0))

gives a vector of TRUE/FALSE, one entry per individual; the final step with "which" returns the indices of the TRUE entries.
We therefore get the individuals for which a date appears exactly Z times.
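Since the question asks for "at least 2", "at least 3" and so on, rather than exactly Z, one possible shortcut is to compute for each participant the largest number of questionnaires sharing a single date and then threshold that value. The following is a sketch on made-up data, not the method used above; `df_toy`, `clean`, `max_same_day` and `at_least` are hypothetical names.

```r
# Toy data: three questionnaires for four hypothetical participants.
df_toy <- data.frame(
  q1 = c("9/13/2013", "10/21/2014", "7/1/2014", "3/15/2014"),
  q2 = c("10/29/2013", "10/21/2014", "7/1/2014", "4/1/2015"),
  q3 = c("9/26/2014", "10/21/2014", "4/27/2016", ""),
  stringsAsFactors = FALSE
)

# Drop NA and blank cells from one row of dates.
clean <- function(x) {
  x <- trimws(as.character(x))
  x[!is.na(x) & x != ""]
}

# Number of questionnaires each participant actually filled in.
n_filled <- apply(df_toy, 1, function(x) length(clean(x)))

# Largest number of questionnaires sharing one date, per participant.
max_same_day <- apply(df_toy, 1, function(x) {
  x <- clean(x)
  if (length(x) == 0) 0L else max(table(x))
})

# Participants with at least k questionnaires on one day, for k = 1, 2, ...
ks <- seq_len(ncol(df_toy))
at_least <- sapply(ks, function(k) sum(max_same_day >= k))
names(at_least) <- paste0(">=", ks)

# Participants who filled in all of their questionnaires on a single day.
all_same_day <- sum(n_filled > 1 & max_same_day == n_filled)

print(at_least)
print(all_same_day)
```

Replacing `>=` with `==` in the threshold step would reproduce the "exactly Z times" reading, with the difference that only the most frequent date per participant is considered.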
