jackson883 - 1 month ago
R Question

How to select a subset using only [] in R?

a<-data.frame(q1=rep(c(1,'A','B'),4),q2=c(1,'A','B','C'),w1=c(1,'A','B','C'))


I want to convert the elements of q1 and q2 which are != 1 to 0, and I want to use only []. I believe all subsetting can be done with [].

a[grep("q\\d",colnames(a),perl=TRUE)!=1,grep("q\\d",colnames(a),perl=TRUE)]<-0


But it doesn't work. What's the problem?

Answer

We create a numeric index ('nm1') of the column names that contain 'q' followed by one or more digits, use it to subset the columns of 'a', and assign 0 to the values in that subset that are not equal to 1.

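# positions of the columns whose names match 'q' followed by digits
nm1 <- grep("q\\d+", names(a))
# a[nm1] != 1 is a logical matrix; it selects the cells to overwrite
a[nm1][a[nm1] != 1] <- 0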

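Make sure the columns are of class character by passing stringsAsFactors = FALSE to data.frame(). In R versions before 4.0.0 the default was TRUE, so the columns would be factors, and assigning 0 to a factor that has no level "0" yields NA with an "invalid factor level" warning.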
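The difference between character and factor columns can be seen on a small example. This is only a sketch: the vectors x_chr and x_fct are hypothetical toy data, not part of the question.

```r
# Hypothetical toy vectors (not from the question's data frame)
x_chr <- c("1", "A", "B")
x_fct <- factor(x_chr)

# Character vector: the numeric 0 is coerced to the string "0"
x_chr[x_chr != 1] <- 0

# Factor: "0" is not an existing level, so R inserts NA and warns
# ("invalid factor level, NA generated"); the warning is silenced here
suppressWarnings(x_fct[x_fct != 1] <- 0)

print(x_chr)
print(x_fct)
```

If factor columns cannot be avoided, converting them with as.character() before the replacement is one way around this.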

The replacement above builds a logical matrix (a[nm1] != 1) with the same dimensions as the subset, which may cause memory problems if the dataset is really big. In that case, it is better to loop through the columns and replace the values with 0 one column at a time.

a[nm1] <- lapply(a[nm1], function(x) replace(x, x!=1, 0))
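As for why the attempt in the question does not work, the following sketch may help. grep() returns column positions, so comparing its result with 1 tests the positions, not the cell values, and the resulting logical vector is then recycled as a row selector. The names idx, row_sel and b are introduced here only for illustration.

```r
a <- data.frame(q1 = rep(c(1, "A", "B"), 4), q2 = c(1, "A", "B", "C"),
                w1 = c(1, "A", "B", "C"), stringsAsFactors = FALSE)

# Column positions of q1 and q2
idx <- grep("q\\d", colnames(a), perl = TRUE)

# Compares the positions (1 and 2) with 1, not the values in the cells
row_sel <- idx != 1

# Used in the row slot, the length-2 logical vector is recycled over all rows
rows_hit <- which(rep_len(row_sel, nrow(a)))

# The original assignment therefore overwrites every even row of q1 and q2,
# regardless of what those cells contain
b <- a
b[idx != 1, idx] <- 0

print(idx)
print(row_sel)
print(rows_hit)
print(head(b))
```

So the row index has to be replaced by something computed from the cell values, which is what the logical matrix a[nm1] != 1 does.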

data

a <- data.frame(q1=rep(c(1,'A','B'),4),q2=c(1,'A','B','C'),
                 w1=c(1,'A','B','C'), stringsAsFactors=FALSE)
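
With this data, a quick check along the following lines suggests that the matrix-based and the column-wise replacement give the same result. The copies b and d are hypothetical names used only to keep 'a' unchanged.

```r
a <- data.frame(q1 = rep(c(1, "A", "B"), 4), q2 = c(1, "A", "B", "C"),
                w1 = c(1, "A", "B", "C"), stringsAsFactors = FALSE)
nm1 <- grep("q\\d+", names(a))

# Matrix-based replacement on a copy
b <- a
b[nm1][b[nm1] != 1] <- 0

# Column-wise replacement on another copy
d <- a
d[nm1] <- lapply(d[nm1], function(x) replace(x, x != 1, 0))

# Compare the two results column by column
same_q1 <- identical(b$q1, d$q1)
same_q2 <- identical(b$q2, d$q2)
print(c(same_q1, same_q2))
print(head(b))
```

Note that the columns stay character, so the replaced cells hold the string "0" rather than the number 0.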