user1679161 - 3 months ago 19

R Question

I have a database like the following

`Var1 Var2 Var3 X Y Z`

VIX SPX VOL 2 3 4

SPX VIX NA 4 4 NA

SPX NA NA 2 NA NA

I want to isolate the first row with inputs from user as VIX, SPX and VOL. However, in a database like this I do not know in which combination these variables are kept in.

That is we might have VIX as Var1, SPX as Var2 and VOL as Var3 or we might have VOL as Var1, SPX as Var2 and VIX as Var3. In such a case 6 combinations are possible.

I can concatenate strings to create all 6 possibilities in R and do a row look up. But I am looking for an easier algorithm.

I am trying to use subset in to filter one column after another but that is equally cumbersome. Is there an easier way out in R

Answer

We can use `apply`

row-wise for the first three columns and check with `grepl`

if it has any of the three possible values and finally using `all`

we verify that all the three columns for that row has the accepted values.

```
df[apply(df[1:3], 1, function(x) all(grepl("VIX|SPX|VOL", x))), ]
# Var1 Var2 Var3 X Y Z
#1 VIX SPX VOL 2 3 4
```

Or as @Cath mentioned in the comments using the above can result into selecting some unnecessary rows. We can change it to

```
df[apply(df[1:3], 1, function(x) all(c("VIX", "SPX", "VOL") %in% x)), ]
```

to ensure that all three unique values are present in the three columns.