Abhinav S Abhinav S - 3 months ago 25
R Question

`rbind` unique entries of all columns of a data frame and write it to a csv file

##Initialise empty dataframe
g <-data.frame(x= character(), y= character(),z=numeric())

## Loop through each columns and list out unique values (with the column name)
for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
setNames(g,c('x','y','z'))
}
## write the output to csv file
write.csv(g,"1.csv")


The output CSV file is something like this

enter image description here

Now the Column headers I want are not proper. I want column headers to be 'x','y','z' respectively. Also the first column should not be there.

Also if you have any other efficient way to do this, let me know. Thanks!

Answer

This will do the work:

for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
}
g <- setNames(g,c('x','y','z'))   ## note the `g <-`
write.csv(g, file="1.csv", row.names = FALSE)   ## don't write row names

setNames returns a new data frame with names "x", "y" and "z", rather than updating the input data frame g. You need the explicit assignment <- to do the "replacement". You may hide such <- by using either of the two

names(g) <- c('x','y','z')
colnames(g) <- c('x','y','z')

Alternatively, you can use the col.names argument inside write.table:

for(i in 1:ncol(iris))
{
a<-data.frame(colnames(iris)[i],unique(iris[,i]),i)
g<-rbind(g,a)
}
write.table(g, file="a.csv", col.names=c("x","y","z"), sep =",", row.names=FALSE)

write.csv() does not support col.names, hence we use write.table(..., sep = ","). Trying to use col.names in write.csv will generate a warning.


A more efficient way

I would avoid using rbind inside a loop. I would do:

x <- lapply(iris, function (column) as.character(unique(column)))
g <- cbind.data.frame(stack(x), rep.int(1:ncol(iris), lengths(x)))
write.table(g, file="1.csv", row.names=FALSE, col.names=c("x","y","z"), sep=",")

Read ?lapply and ?stack for more.