user3354212 - 3 months ago
R Question

How to concatenate columns and column names for each row, excluding a specific value, in R

I have a dataframe:

df = read.table(text="X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header=T, stringsAsFactors=F)


For each row separately, I would like to concatenate the column values and their column names, excluding any column whose value is "U". To find which rows and columns contain "U", use

which(df == "U", arr.ind=TRUE)


The expected result is:

output = read.table(text="'X1 X3 X4 X5 X6 X7' 'C C D B C C'
'X1 X2 X3 X4 X5 X6' 'D C B A C D'
'X1 X2 X3 X4 X5 X6 X7' 'D C B A C D D'
'X1 X2 X5 X6 X7' 'C D B C D'
'X1 X2 X3 X4 X5 X7' 'C D B D C C'
'X1 X2 X3 X4 X5 X6 X7' 'D C C A B C D'
'X2 X3 X6 X7' 'D C C C'", header=F, stringsAsFactors=F)


I don't know how to get the expected result without using a loop. Thanks.

Answer

One straightforward option is apply with MARGIN = 1, which processes the data frame row by row:

t(apply(df, 1, function(x) {
            i1 <- x!="U"
            c(V1=paste(names(x)[i1], collapse=" "),
              V2= paste(x[i1], collapse=" ")) }))
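
As a quick check, the sketch below rebuilds the question's df, runs the same apply call, and converts the resulting character matrix to a data frame so it has the same shape as the expected output. The names res and out are only illustrative, and the conversion step is an assumption about what the asker wants to end up with.

```r
# Rebuild the example data from the question
df <- read.table(text = "X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header = TRUE, stringsAsFactors = FALSE)

# For every row, keep only the entries that are not "U" and paste
# the surviving column names (V1) and values (V2) together
res <- t(apply(df, 1, function(x) {
  i1 <- x != "U"
  c(V1 = paste(names(x)[i1], collapse = " "),
    V2 = paste(x[i1], collapse = " "))
}))

# apply() returns a character matrix; convert it to a data frame
out <- as.data.frame(res, stringsAsFactors = FALSE)
out
```

Note that apply coerces df to a character matrix first, which is harmless here because every column is already character.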

To get the values alone, another option is to paste each row into a single string and then remove the "U" entries with gsub:

trimws(gsub("\\s*U", "", do.call(paste, df)))

Or, as @RHertel mentioned:

gsub("\\sU|U\\s","",do.call(paste,df)) 
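
The gsub calls above return only the values. A possible way to get the column names with the same paste idea is sketched below: blank out the "U" cells in the value matrix and in a same-shaped matrix of column names, paste each row, and then squeeze the leftover whitespace. This is only one option; the helper squash and the names m, nm, drop and out2 are hypothetical and not part of the original answer.

```r
# Rebuild the example data from the question
df <- read.table(text = "X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header = TRUE, stringsAsFactors = FALSE)

m <- as.matrix(df)
drop <- m == "U"                      # cells to exclude

# Matrix of the same shape holding the column name of every cell
nm <- matrix(colnames(m), nrow = nrow(m), ncol = ncol(m), byrow = TRUE)

# Blank out the excluded cells in both matrices
nm[drop] <- ""
m[drop] <- ""

# Hypothetical helper: paste each row, then collapse runs of
# whitespace and trim the ends
squash <- function(mat) {
  s <- do.call(paste, as.data.frame(mat, stringsAsFactors = FALSE))
  trimws(gsub("\\s+", " ", s))
}

out2 <- data.frame(V1 = squash(nm), V2 = squash(m),
                   stringsAsFactors = FALSE)
out2
```

Because this compares whole cells with "U" instead of deleting the letter from the pasted string, it should also behave sensibly if some values are longer strings that happen to contain a "U".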