Kim Jenkins - 2 months ago
R Question

nested If statement on dates

I have a dataframe df which is as follows.

Id ProcessDate
10 2011-12-29 14:14:00
11 2011-12-29 14:16:00
12 2011-12-29 14:14:00
13 2011-12-29 14:20:00
14 2011-12-29 14:49:00
15 2011-12-29 14:51:00
16 2011-12-29 14:53:00
17 2011-12-29 15:11:00
18 2011-12-29 15:13:00
19 2011-12-29 15:10:00
20 2011-12-29 15:21:00
21 2011-12-29 14:34:00
22 2011-12-29 15:26:00


I am trying to create a third column, Status, that will contain one of these three values {Before, Between, After} based on this condition.

if (df$ProcessDate < 2011-12-29 14:48:00)
then df$Status = "Before"
else if (df$ProcessDate > 2011-12-29 14:48:00 & df$ProcessDate < 2011-12-29 15:16:00)
then df$Status = "Between"
else df$Status = "After"


The final dataframe should look like this.

Id ProcessDate Status
10 2011-12-29 14:14:00 Before
11 2011-12-29 14:16:00 Before
12 2011-12-29 14:14:00 Before
13 2011-12-29 14:20:00 Before
14 2011-12-29 14:49:00 Between
15 2011-12-29 14:51:00 Between
16 2011-12-29 14:53:00 Between
17 2011-12-29 15:11:00 Between
18 2011-12-29 15:13:00 Between
19 2011-12-29 15:10:00 Between
20 2011-12-29 15:21:00 After
21 2011-12-29 14:34:00 Before
22 2011-12-29 15:26:00 After


I tried a few things and none of them worked. Any help on this issue is much appreciated.

Answer

One possible solution is to convert your times to epoch values (seconds since 1970-01-01 UTC) and then compare them as plain numbers. This can be done by using as.integer(as.POSIXct("Time")) as shown below

# Build the sample data; the timestamps start out as character strings
df = data.frame(
    ids  = 10:22,
    date = c('2011-12-29 14:14:00', '2011-12-29 14:16:00', '2011-12-29 14:14:00',
             '2011-12-29 14:20:00', '2011-12-29 14:49:00', '2011-12-29 14:51:00',
             '2011-12-29 14:53:00', '2011-12-29 15:11:00', '2011-12-29 15:13:00',
             '2011-12-29 15:10:00', '2011-12-29 15:21:00', '2011-12-29 14:34:00',
             '2011-12-29 15:26:00'),
    stringsAsFactors = FALSE
)
# Convert to epoch seconds (the strings are read in the session time zone)
df$date = as.integer(as.POSIXct(df$date))

# Bounds of the "Between" window, converted the same way
upper   = as.integer(as.POSIXct('2011-12-29 15:16:00'))
lower   = as.integer(as.POSIXct('2011-12-29 14:48:00'))

The date column now holds the converted epoch values, as below

> df
    ids       date
1   10 1325148240
2   11 1325148360
3   12 1325148240
4   13 1325148600
5   14 1325150340
6   15 1325150460
7   16 1325150580
8   17 1325151660
9   18 1325151780
10  19 1325151600
11  20 1325152260
12  21 1325149440
13  22 1325152560
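
A side note on the conversion: as.POSIXct reads a string in the time zone of the R session unless told otherwise, so the integers above can differ from machine to machine. The sketch below is only an illustration of that effect. The two tz values are assumptions chosen for the example, not something the question specifies.

```r
# Same wall-clock string, interpreted in two explicit time zones.
# The tz choices are illustrative assumptions.
stamp <- "2011-12-29 14:14:00"

epoch_utc <- as.integer(as.POSIXct(stamp, tz = "UTC"))
epoch_ist <- as.integer(as.POSIXct(stamp, tz = "Asia/Kolkata"))

epoch_utc   # 1325168040
epoch_ist   # 1325148240, the value shown in row 1 above
```

As long as the column and both bounds are converted in the same time zone, the comparison gives the same Status values whichever zone is used.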

Then you can simply perform a numeric comparison, row by row

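df$Status = NA_character_    # create the column before filling it row by row
for(i in seq_len(nrow(df))){
    # && is the scalar "and"; a value exactly equal to lower or upper falls through to "After"
    if(df$date[i] < lower)
            df$Status[i] = "Before"
    else if(df$date[i] > lower && df$date[i] < upper)
            df$Status[i] = "Between"
    else
            df$Status[i] = "After"
}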

Resulting in the output

> df
    ids       date  Status
1   10 1325148240  Before
2   11 1325148360  Before
3   12 1325148240  Before
4   13 1325148600  Before
5   14 1325150340 Between
6   15 1325150460 Between
7   16 1325150580 Between
8   17 1325151660 Between
9   18 1325151780 Between
10  19 1325151600 Between
11  20 1325152260   After
12  21 1325149440  Before
13  22 1325152560   After
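
The loop above works, but the same labels can be produced without a loop and without converting to integers, because POSIXct values can be compared directly. What follows is a minimal sketch of that alternative, not the method used in the answer. It assumes a fixed tz = "UTC" for reproducibility, and it treats a timestamp exactly equal to lower as "Between" and one exactly equal to upper as "After", which is a boundary choice the question does not settle.

```r
# Rebuild the sample data, keeping the timestamps as POSIXct.
# tz = "UTC" is an assumption made so the example is reproducible.
df <- data.frame(
  ids  = 10:22,
  date = as.POSIXct(c(
    "2011-12-29 14:14:00", "2011-12-29 14:16:00", "2011-12-29 14:14:00",
    "2011-12-29 14:20:00", "2011-12-29 14:49:00", "2011-12-29 14:51:00",
    "2011-12-29 14:53:00", "2011-12-29 15:11:00", "2011-12-29 15:13:00",
    "2011-12-29 15:10:00", "2011-12-29 15:21:00", "2011-12-29 14:34:00",
    "2011-12-29 15:26:00"
  ), tz = "UTC")
)

lower <- as.POSIXct("2011-12-29 14:48:00", tz = "UTC")
upper <- as.POSIXct("2011-12-29 15:16:00", tz = "UTC")

# Nested ifelse() is vectorized: it labels the whole column in one call.
df$Status <- ifelse(df$date < lower, "Before",
             ifelse(df$date < upper, "Between", "After"))

# Equivalent lookup with findInterval(): 0 = below lower, 1 = in the window,
# 2 = at or above upper. Adding 1 turns that into an index into the labels.
labels  <- c("Before", "Between", "After")
status2 <- labels[findInterval(as.numeric(df$date), as.numeric(c(lower, upper))) + 1]

df
```

The ifelse() form stays closest to the nested if/else in the question. The findInterval() form is easier to extend if more than two cut points are needed later.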