Bipero -4 years ago 97

R Question

I have the following data frame, which will be used as input to a logit regression:

`my_frame<-data.frame(y=c(1,0,1),A=c(0,1,1),B=c(1,0,0),C=c(0,0,0),t=c(1,1,1),x=c(1,0,0),z=c(1,0,1))`

Knowing that the dummy variables A, B and C are connected through a linear equation (A+B+C=1), I need to drop one of the three before proceeding.

| y | A | B | C | t | x | z |
|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 | 1 | 1 |
| 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 1 | 0 | 1 |
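
The constraint A+B+C=1 can be checked directly on the data. This is a minimal sketch; the names `dummy_cols` and `sums_to_one` are only illustrative and are not part of the original question:

```r
my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0), C = c(0, 0, 0),
                       t = c(1, 1, 1), x = c(1, 0, 0), z = c(1, 0, 1))

# Illustrative name: the group of dummy columns that should sum to 1 in every row
dummy_cols <- c("A", "B", "C")

# TRUE when every row satisfies A + B + C == 1
sums_to_one <- all(rowSums(my_frame[, dummy_cols]) == 1)
sums_to_one
```

If this holds, the three dummies together are perfectly collinear with an intercept, which is why one of them has to go before fitting the model.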

Now, here is the difficult part. I want to randomly exclude one of the columns from the group made up of A, B, C and D, but not the one that has a value of 1 in the last row of the data frame.

In my example, I want either B or C to be excluded.


Answer Source

I don't quite understand what you mean by your last sentence about column D, but you could try this:

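```
my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0), C = c(0, 0, 0),
                       t = c(1, 1, 1), x = c(1, 0, 0), z = c(1, 0, 1))
allRelevantCols <- c("A", "B", "C")
# Get all columns that can be excluded (those with a 0 in the last row)
allColsToExclude <- allRelevantCols[which(my_frame[nrow(my_frame), allRelevantCols] == 0)]
n_runs <- 1  # how often you would like to run this (placeholder value, adjust as needed)
results <- vector("list", n_runs)
for (i in seq_len(n_runs)) {
  # Randomly pick one excludable column and keep the reduced data frame
  colsToExclude <- sample(allColsToExclude, 1)
  results[[i]] <- my_frame[, !(colnames(my_frame) %in% colsToExclude), drop = FALSE]
}
```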
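
The same idea can be wrapped in a small function. This is only a sketch: `drop_random_dummy` is a hypothetical helper name, not something from base R or a package, and it assumes that group names missing from the data frame (such as the D mentioned in the question) should simply be ignored:

```r
# Hypothetical helper: drop one randomly chosen dummy column from `group_cols`,
# never the one that equals 1 in the last row of `df`.
drop_random_dummy <- function(df, group_cols) {
  # Assumption: names that are not columns of df (e.g. "D") are ignored
  group_cols <- intersect(group_cols, colnames(df))
  last_row <- unlist(df[nrow(df), group_cols, drop = FALSE])
  candidates <- group_cols[last_row == 0]
  if (length(candidates) == 0) stop("no column in the group can be dropped")
  # Pick by index so that a single candidate is handled the same way as several
  drop_col <- candidates[sample.int(length(candidates), 1)]
  df[, setdiff(colnames(df), drop_col), drop = FALSE]
}

my_frame <- data.frame(y = c(1, 0, 1), A = c(0, 1, 1), B = c(1, 0, 0), C = c(0, 0, 0),
                       t = c(1, 1, 1), x = c(1, 0, 0), z = c(1, 0, 1))

set.seed(1)  # only to make this illustration reproducible
reduced <- drop_random_dummy(my_frame, c("A", "B", "C", "D"))
colnames(reduced)
```

With the example data, A is always kept because it is 1 in the last row, and exactly one of B and C is removed.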
