user1679161 user1679161 - 1 month ago 6
R Question

Easier solution for searching across columns in data.frame

I have a database like the following

Var1 Var2 Var3 X Y Z
VIX SPX VOL 2 3 4
SPX VIX NA 4 4 NA
SPX NA NA 2 NA NA


I want to isolate the first row with inputs from user as VIX, SPX and VOL. However, in a database like this I do not know in which combination these variables are kept in.

That is we might have VIX as Var1, SPX as Var2 and VOL as Var3 or we might have VOL as Var1, SPX as Var2 and VIX as Var3. In such a case 6 combinations are possible.

I can concatenate strings to create all 6 possibilities in R and do a row look up. But I am looking for an easier algorithm.

I am trying to use subset in to filter one column after another but that is equally cumbersome. Is there an easier way out in R

Answer

We can use apply row-wise for the first three columns and check with grepl if it has any of the three possible values and finally using all we verify that all the three columns for that row has the accepted values.

df[apply(df[1:3], 1, function(x) all(grepl("VIX|SPX|VOL", x))), ]

#   Var1 Var2 Var3 X Y Z
#1  VIX  SPX  VOL  2 3 4

Or as @Cath mentioned in the comments using the above can result into selecting some unnecessary rows. We can change it to

df[apply(df[1:3], 1, function(x) all(c("VIX", "SPX", "VOL") %in% x)), ]

to ensure that all three unique values are present in the three columns.

Comments