nik nik - 2 months ago 6
R Question

How to merge the strings from all columns into one and remove duplicates?

I have data that looks like the following df:

df<- structure(list(V1 = structure(c(5L, 1L, 2L, 3L, 4L), .Label = c("DNAJC11;FGOTG",
"MAPK14", "PPIB", "RBX1", "USP14"), class = "factor"), V2 = structure(c(4L,
3L, 2L, 1L, 1L), .Label = c("", "DNAJC9", "MAPK14", "USP14"), class = "factor"),
V3 = structure(c(3L, 2L, 4L, 5L, 1L), .Label = c("", "DNAJC11;FGOTG",
"GCLC", "GSR", "STIP1"), class = "factor")), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -5L))


I want to merge all columns into one and then keep only the unique values.
For example, the output should look like this:

USP14
DNAJC11;FGOTG
MAPK14
PPIB
RBX1
DNAJC9
GCLC
GSR
STIP1


I tried to use the melt function, but I could not figure out how to do this. Any comment is appreciated. Thanks.

Answer
unique(as.vector(as.matrix(df)))

To remove the entries with no characters:

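vec <- unique(as.vector(as.matrix(df)))
# logical indexing is safe even when there are no empty strings
# (vec[-which(vec == "")] would return an empty vector in that case)
vec[vec != ""]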

Or, courtesy of @rawr:

Filter(nzchar, unique(as.vector(as.matrix(df))))
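
For comparison, here is a minimal sketch of the same idea that skips the matrix conversion and flattens the columns directly. The data frame is rebuilt by hand from the values in the question (with stringsAsFactors = TRUE to mimic the factor columns), and the names all_values and result are only illustrative.

```r
# Rebuild the question's data by hand; stringsAsFactors = TRUE mimics
# the factor columns of the original df (an assumption for this sketch).
df <- data.frame(
  V1 = c("USP14", "DNAJC11;FGOTG", "MAPK14", "PPIB", "RBX1"),
  V2 = c("USP14", "MAPK14", "DNAJC9", "", ""),
  V3 = c("GCLC", "DNAJC11;FGOTG", "GSR", "STIP1", ""),
  stringsAsFactors = TRUE
)

# Convert each factor column to character, then flatten column by column.
all_values <- unlist(lapply(df, as.character), use.names = FALSE)

# Keep the first occurrence of each value, then drop empty strings.
result <- unique(all_values)
result <- result[nzchar(result)]

print(result)
```

Converting with as.character before flattening avoids working with the factors' integer codes, and indexing with nzchar() keeps the vector intact even if no empty strings are present.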