Dimon D. - 2 months ago
R Question

The fastest way to write a CSV file with write.csv.raw (iotools package)

I am experimenting with different packages to find the fastest way to save data files such as CSV files.

I have found the 'iotools' package and its 'write.csv.raw' function, which is pretty good in terms of elapsed time.

However, the saved file has two problems:


  • no column names;

  • double/float numbers use "." as the decimal sign rather than ",".



So I need the saved file to contain the column names and the correct decimal sign.

My script is as follows:

library(iotools)
library(UsingR)

data(galton)
head(galton)
#option1 to save data
write.csv.raw(galton,"test.csv",append=FALSE,sep=";",col.names=TRUE)
#option2 to save data
write.table.raw(galton,"test.csv",append=FALSE,sep=";",col.names=TRUE)
read.csv2("test.csv",nrow=5)


The input dataset (from R):

child parent
61.7 70.5
61.7 68.5
61.7 65.5
61.7 64.5
61.7 64.0
62.2 67.5


The output file:

X1.61.7 X70.5
2\t61.7 68.5
3\t61.7 65.5
4\t61.7 64.5
5\t61.7 64
6\t62.2 67.5
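
The odd header in the output above is consistent with a file that has no header line and whose rows each start with a row index followed by a tab, so that read.csv2 takes the first data row as the column names. A minimal base-R sketch, assuming the file content looks like the hand-written lines below (they are an illustration, not actual iotools output):

```r
# Hypothetical file content: row index, a tab, then ";"-separated values, no header line.
path <- tempfile(fileext = ".csv")
writeLines(c("1\t61.7;70.5", "2\t61.7;68.5", "3\t61.7;65.5"), path)

# read.csv2 defaults to header = TRUE, so the first data row becomes the column names;
# make.names() then prefixes "X" and replaces the tab with ".".
parsed <- read.csv2(path)
print(names(parsed))

# Reading without a header keeps all three rows as data.
raw_rows <- read.csv2(path, header = FALSE)
print(nrow(raw_rows))
```

Inspecting the file with readLines("test.csv", n = 3) before parsing it is a quick way to check whether the header and the row index are really there.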


Update of 18/02/16:

With the help of the answer by procrastinator0, I have managed to use 'write.csv.raw' correctly.

A comparison of different write methods, based on the data frame from the question section, follows:


system.time(write.csv.raw(n,"test.csv",sep=";",append=TRUE))
   user  system elapsed
  15.61    1.17   21.92

system.time(write.table(n,"test.csv",sep=";",row.names=FALSE,dec=","))
   user  system elapsed
  63.25    1.20   64.60

system.time(write.csv2(n,"test.csv",row.names=FALSE))
   user  system elapsed
  63.71    1.28   65.38

system.time(write_csv(n, "test.csv", na = "NA"))
   user  system elapsed
 136.75    3.60  141.24
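
A comparison like the one above can be reproduced with a small harness. This is only a sketch using base R writers: the synthetic data frame, its size, and the helper name time_writer are assumptions made for illustration, not the data or code behind the timings shown.

```r
# Synthetic stand-in for the data frame `n`; size and values are hypothetical.
set.seed(1)
n_rows <- 1e5
n <- data.frame(
  child  = round(rnorm(n_rows, mean = 68, sd = 2.5), 1),
  parent = round(rnorm(n_rows, mean = 68, sd = 1.8), 1)
)
out <- tempfile(fileext = ".csv")

# Hypothetical helper: run one writer and return its elapsed time in seconds.
time_writer <- function(writer) {
  unname(system.time(writer(n, out))["elapsed"])
}

timings <- c(
  write.table = time_writer(function(x, f)
    write.table(x, f, sep = ";", row.names = FALSE, dec = ",")),
  write.csv2 = time_writer(function(x, f)
    write.csv2(x, f, row.names = FALSE))
)
print(timings)
```

Other writers, such as write.csv.raw, can be added to the vector in the same way once their packages are loaded. Each writer here overwrites the same file, so the runs do not accumulate data between measurements.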


Update of 27/04/16:

I have done some experimental runs writing and reading data with different tools. The experiments are based on a theoretical sample as well as a real one (from my practice). I have tried to make the scripts reproducible. I hope they will be useful for newcomers :-)

Links to IO experiments:

Reading data from files: https://rpubs.com/demydd/166375

Writing data to files: https://rpubs.com/demydd/170957

Update of 19/09/16:

The feather package has been added (read_feather, write_feather).
fwrite from the data.table package has been added.

Links to updated tests:

to read

to write

Answer

For column names, this is a known issue. Suggested workaround:

> cat(noquote(paste0(paste0(names(df),collapse = ","),"\n")),file = "output.csv")
> write.csv.raw(df,"output.csv",append=TRUE)

write.csv.raw does not prefix rows with an index followed by "\t" for me by default, but you could try passing NA as the nsep argument.
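
Combining the workaround with the decimal-sign requirement from the question could look like the sketch below. The helper names to_decimal_comma and write_with_header are hypothetical, and base write.table stands in for the body writer so that the sketch runs without iotools. The commented-out line shows the write.csv.raw call from the question's update; how it handles the converted character columns is an assumption that has not been tested here.

```r
# Hypothetical helper: turn numeric columns into text that uses "," as the decimal sign.
to_decimal_comma <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], function(x) sub(".", ",", as.character(x), fixed = TRUE))
  df
}

# Hypothetical helper: write the header line first, then append the data rows.
write_with_header <- function(df, file, sep = ";") {
  writeLines(paste(names(df), collapse = sep), file)
  body <- to_decimal_comma(df)
  # With iotools loaded, the body could instead be appended as in the question's update:
  # write.csv.raw(body, file, sep = sep, append = TRUE)
  write.table(body, file, sep = sep, append = TRUE, quote = FALSE,
              row.names = FALSE, col.names = FALSE)
  invisible(file)
}

df <- data.frame(child = c(61.7, 62.2), parent = c(70.5, 64.0))
path <- tempfile(fileext = ".csv")
write_with_header(df, path)
print(readLines(path))
print(read.csv2(path))
```

Converting the numbers to text before writing adds its own cost, so it is worth timing the whole function rather than the write call alone.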