Patrick Williams - 3 years ago
R Question

running apply (or variant) like an embedded loop

I'd like to do something like an embedded loop, but using apply functions, the goal of which is to check various conditions prior to moving on to the next part of my program.

I have two objects. The first is a list of product descriptions, which can be created as follows:

test_products <- list(c("dingdong","small","affordable","polished"),c("wingding","medium","cheap","dull"),c("doodad","big","expensive","shiny"))

The second is a data frame of feature combinations that are not allowed, where each row represents one disallowed combination. A sample data frame can be created as follows:

disallowed <- data.frame(trait1 = c("dingdong","wingding","doodad"),
                         trait2 = c("medium","big","big"),
                         stringsAsFactors = FALSE)

My goal is to check each product against each of the disallowed combinations as efficiently as possible. So far I can check one product against all prohibitions as follows (in this case, the third product):

apply(disallowed, 1, function(x) x %in% unlist(test_products[[3]]))

Or I can check all products against one of the disallowed combinations of traits (here, the third combination):

lapply(test_products, function(x) disallowed[3,] %in% x)

Is it possible to check all products against all rows of the data frame of disallowed feature combinations, without using a loop?

My end result should look something like this:

Product 1: OK
Product 2: OK
Product 3: NOT OK

Since Product 3 runs afoul of the third disallowed row.

ycw
Answer Source

There are definitely more elegant ways, but I am going to share my thoughts on this.

First, the way you created the disallowed data frame is convoluted. I decided to use the following code to create disallowed.

# Create a data frame showing disallowed traits
disallowed <- data.frame(trait1 = c("dingdong","wingding","doodad"), 
                         trait2 = c("medium","big","big"),
                         stringsAsFactors = FALSE)

I then created a function called violate, which takes two arguments. The first argument, product, is a character vector. The second argument, check_df, is the data frame containing the disallowed traits.

The output of violate is a logical vector with one element per row of check_df. TRUE means that both traits in that row are present in the product.

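# Create the violate function
violate <- function(product, check_df){
  # For each column of check_df, flag the rows whose value appears in product
  temp_df <- as.data.frame(lapply(check_df, function(Col) Col %in% product))
  # A row is violated when both of its traits were found in product
  temp_vec <- apply(temp_df, 1, function(Row) sum(Row) == 2)
  return(temp_vec)
}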

# Test the violate function
violate(test_products[[3]], check_df = disallowed)
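To make the two steps inside the function easier to follow, here is a minimal sketch that builds the intermediate table by hand for the third product. It uses vapply and rowSums instead of the calls above, and the names product, hits and violations are only illustrative.

```r
test_products <- list(c("dingdong", "small", "affordable", "polished"),
                      c("wingding", "medium", "cheap", "dull"),
                      c("doodad", "big", "expensive", "shiny"))
disallowed <- data.frame(trait1 = c("dingdong", "wingding", "doodad"),
                         trait2 = c("medium", "big", "big"),
                         stringsAsFactors = FALSE)

product <- test_products[[3]]

# Step 1: one logical column per trait column, one row per disallowed combination.
# hits[i, j] is TRUE when the value in row i, column j of disallowed is in product.
hits <- vapply(disallowed, function(col) col %in% product,
               logical(nrow(disallowed)))

# Step 2: a combination is violated only if every trait in its row was found.
violations <- rowSums(hits) == ncol(hits)
violations
# FALSE FALSE TRUE
```

Only the third row is flagged, because both "doodad" and "big" appear in the third product.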

After that, I applied the violate function to each element of the test_products list using sapply. For each product, the result from violate is checked to see whether all of the disallowed checks are FALSE.

# Apply the violate function and check if all results from violate are FALSE
sapply(test_products, function(product){
  sum(violate(product, check_df = disallowed)) == 0})

The result is TRUE TRUE FALSE. The third element is FALSE, indicating that the third product is not OK, while product 1 and product 2 are OK because their elements are both TRUE.
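If you want the output in the "Product n: OK" form from the question, one possible sketch is below. The helper name is_ok is hypothetical, and it re-implements the same check in a self-contained way rather than reusing the function above.

```r
test_products <- list(c("dingdong", "small", "affordable", "polished"),
                      c("wingding", "medium", "cheap", "dull"),
                      c("doodad", "big", "expensive", "shiny"))
disallowed <- data.frame(trait1 = c("dingdong", "wingding", "doodad"),
                         trait2 = c("medium", "big", "big"),
                         stringsAsFactors = FALSE)

# Hypothetical helper: TRUE when no disallowed row is fully contained in product
is_ok <- function(product, check_df) {
  hits <- vapply(check_df, function(col) col %in% product,
                 logical(nrow(check_df)))
  # vapply returns a plain vector when check_df has a single row,
  # so force a matrix with one row per disallowed combination
  hits <- matrix(hits, nrow = nrow(check_df))
  !any(rowSums(hits) == ncol(check_df))
}

# One logical per product, then turn the logicals into labels
ok <- vapply(test_products, is_ok, logical(1), check_df = disallowed)
labels <- paste0("Product ", seq_along(ok), ": ", ifelse(ok, "OK", "NOT OK"))
writeLines(labels)
# Product 1: OK
# Product 2: OK
# Product 3: NOT OK
```

Comparing against ncol(check_df) instead of a fixed 2 means the same sketch should also work if more trait columns are added to the data frame.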
