user3354212 - 1 year ago
R Question

How to concatenate columns and column names for each row, excluding a specific value in R

I have a dataframe:

df = read.table(text="X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header=TRUE, stringsAsFactors=FALSE)


For each row, I would like to concatenate the values of all columns and, separately, the corresponding column names, excluding any column whose value is "U". To find out which rows and columns contain "U", use:

which(df == "U", arr.ind=TRUE)


The expected result is:

output = read.table(text="'X1 X3 X4 X5 X6 X7' 'C C D B C C'
'X1 X2 X3 X4 X5 X6' 'D C B A C D'
'X1 X2 X3 X4 X5 X6 X7' 'D C B A C D D'
'X1 X2 X5 X6 X7' 'C D B C D'
'X1 X2 X3 X4 X5 X7' 'C D B D C C'
'X1 X2 X3 X4 X5 X6 X7' 'D C C A B C D'
'X2 X3 X6 X7' 'D C C C'", header=FALSE, stringsAsFactors=FALSE)


I don't know how to get the expected result without using a loop. Thanks.

Answer Source

One easy option is apply with MARGIN = 1, which calls the function once per row and passes each row as a named character vector:

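t(apply(df, 1, function(x) {
    i1 <- x != "U"                               # TRUE for the cells to keep in this row
    c(V1 = paste(names(x)[i1], collapse = " "),  # names of the kept columns
      V2 = paste(x[i1], collapse = " "))         # values of the kept columns
}))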
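For comparison, here is a rough sketch that starts from the which(..., arr.ind=TRUE) call mentioned in the question instead of apply. It is not part of the original answer: the names idx and res are just illustrative, and tapply still loops internally, so treat it as an alternative formulation rather than a faster one.

```r
# Same data as in the question
df <- read.table(text = "X1 X2 X3 X4 X5 X6 X7
C U C D B C C
D C B A C D U
D C B A C D D
C D U U B C D
C D B D C U C
D C C A B C D
U D C U U C C", header = TRUE, stringsAsFactors = FALSE)

# Row/column positions of every cell that is NOT "U"
idx <- which(df != "U", arr.ind = TRUE)

# Sort by row, then by column, so the pieces are pasted in column order
idx <- idx[order(idx[, "row"], idx[, "col"]), , drop = FALSE]

# Group the column names and the cell values by row, then paste each group
res <- data.frame(
  V1 = as.vector(tapply(names(df)[idx[, "col"]], idx[, "row"],
                        paste, collapse = " ")),
  V2 = as.vector(tapply(as.matrix(df)[idx], idx[, "row"],
                        paste, collapse = " ")),
  stringsAsFactors = FALSE
)
res
```

One caveat with this sketch: a row made up entirely of "U" has no entries in idx, so it would be missing from res, whereas the apply version returns empty strings for such a row.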

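To get only the values, another option is to paste each row into a single string and then strip the "U" entries with gsub. This works here because every value is a single character: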

trimws(gsub("\\s*U", "", do.call(paste, df)))

Or, as @RHertel mentioned:

gsub("\\sU|U\\s", "", do.call(paste, df))
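Both gsub patterns remove every "U" character they find, which is fine for the single-letter values in the question. If the real data could contain longer values such as "CU" or "UP" (an assumption, since the question only shows single letters), a word-boundary pattern might be safer. A minimal sketch, with drop_u and vals as made-up names:

```r
# Hypothetical pasted rows whose values merely contain the letter U
vals <- c("CU U UP D", "U A U", "B U")

# Remove only standalone "U" tokens, then tidy up the spacing
drop_u <- function(s) {
  s <- gsub("\\bU\\b", "", s, perl = TRUE)    # whole-word U only
  trimws(gsub("\\s+", " ", s, perl = TRUE))   # collapse runs of spaces, trim ends
}

drop_u(vals)
# [1] "CU UP D" "A"       "B"
```

On the data from the question, drop_u(do.call(paste, df)) should give the same strings as the two one-liners above.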