Donkeykongy - 2 months ago
R Question

Getting specific rows in a data.table with specific criteria

I would like to extract the rows of a dataset that meet certain criteria. These are the criteria I would like to apply:

keep <- integer(0)
for (i in seq_len(nrow(DT) - 1)) {
  if (DT$IBTKR2[i] == DT$IBTKR2[i + 1] &&
      DT$AMASKCD[i] == DT$AMASKCD[i + 1] &&
      DT$IRECCD[i] != 3 && DT$IRECCD[i + 1] == 3) {
    keep <- c(keep, i + 1)  # remember the row below
  }
}
DT[keep, ]  # the subset I am after


In other words: if a row and the row below it have the same IBTKR2, check AMASKCD. If AMASKCD is also the same in both rows, check IRECCD. If IRECCD is not equal to 3 in the row and is equal to 3 in the row below, I would like to take the row below (row i+1) and add it to the subset.

Below is a sample of my dataset:

Row IBTKR2 AMASKCD IRECCD ANNDATS
1 @0CC 71476 1 20000704
2 @0CC 71476 1 20001204
3 @0CF 19813 3 20000831
4 @0CF 47104 3 20000420
5 @0CF 47340 3 20000418
6 @0CF 48938 3 20000821
7 @0CF 56117 2 20000330
8 @0CF 56117 3 20000413
9 @0CF 56117 2 20000526
10 @0CF 56117 3 20000713
11 @0CF 56117 2 20000801
12 @0CF 56117 3 20000804
13 @0CF 58875 3 20000609
14 @0CF 58875 1 20000822
15 @0CF 74030 3 20001027


and I should get this subset:

Row IBTKR2 AMASKCD IRECCD ANNDATS
8 @0CF 56117 3 20000413
10 @0CF 56117 3 20000713
12 @0CF 56117 3 20000804

Answer

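We can use shift with type = "lead" to compare each row with the row below it, convert the resulting logical vector to row indices with which(), add 1 to point at the row below, and extract those rows. Leaving fill at its default (NA) means the last row, which has no row below it, can never match.

library(data.table)
setDT(DT) # in case the dataset is not already a `data.table`
# Row i qualifies when it matches row i + 1 on IBTKR2 and AMASKCD,
# has IRECCD != 3, and row i + 1 has IRECCD == 3; we then keep row i + 1.
# which() drops the NA that shift() produces for the last row.
DT[DT[, which(IBTKR2 == shift(IBTKR2, type = "lead") &
              AMASKCD == shift(AMASKCD, type = "lead") &
              IRECCD != 3 & shift(IRECCD, type = "lead") == 3)] + 1L]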
#   Row IBTKR2 AMASKCD IRECCD  ANNDATS
#1:   8   @0CF   56117      3 20000413
#2:  10   @0CF   56117      3 20000713
#3:  12   @0CF   56117      3 20000804
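
The same selection can also be written from the point of view of the row being kept, by looking back at the previous row instead of forward at the next one. The following is a sketch of that variant, not part of the original answer: it rebuilds the sample data from the question so it can be run on its own, uses shift with its default type = "lag", and stores the result in `res` (a name chosen here for illustration). It assumes the rows are already in the order in which they should be compared.

```r
library(data.table)

# Sample data copied from the question
DT <- data.table(
  Row     = 1:15,
  IBTKR2  = c("@0CC", "@0CC", rep("@0CF", 13)),
  AMASKCD = c(71476, 71476, 19813, 47104, 47340, 48938,
              56117, 56117, 56117, 56117, 56117, 56117,
              58875, 58875, 74030),
  IRECCD  = c(1, 1, 3, 3, 3, 3, 2, 3, 2, 3, 2, 3, 3, 1, 3),
  ANNDATS = c(20000704, 20001204, 20000831, 20000420, 20000418,
              20000821, 20000330, 20000413, 20000526, 20000713,
              20000801, 20000804, 20000609, 20000822, 20001027)
)

# Keep a row when the previous row has the same IBTKR2 and AMASKCD,
# the previous IRECCD is not 3, and the current IRECCD is 3.
# shift() defaults to type = "lag" and fill = NA; data.table treats
# NA in a logical i as FALSE, so the first row is never selected.
res <- DT[IBTKR2 == shift(IBTKR2) &
          AMASKCD == shift(AMASKCD) &
          shift(IRECCD) != 3 & IRECCD == 3]

res$Row
# [1]  8 10 12
```

Because the condition is evaluated on the row that is returned, this form needs neither the + 1 offset nor an explicit conversion to row indices.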