HiThere HiThere - 4 months ago 9
R Question

How to save a CSV file with R with line breaks that Notepad will recognize?

I'm sorry to bother you with probably an encoding question. Spending couple of hours without getting the solution I decided to post it here.
I'm trying to write a simple table unsuccessfully using

write.table
,
write.csv
,
write.csv2
from Ubuntu 14.04. My data is kind of messy resulting from a cronjob:

ID <- c("",30,26,20,30,40,5,10,4)
b <- c("",2233,12,2,22,13,23,23,100)
c <- c("","","","","","","","","")
d <- c("","","","","","","","","")
e <- c("","","","","","800","","","")
f <- c("","","","","","","","","")
g <- c("","","","","","","","EA","")
h <- c("","","","","","","","","")
df <- data.frame(ID,b,c,d,e,f,g,h)

# change columns to chr
for(i in c(1,2:ncol(df))) {
df[,i] <- as.character(df[,i])
}

str(df)

# data.frame': 9 obs. of 8 variables:
# $ ID: chr "" "30" "26" "20" ...
# $ b : chr "" "2233" "12" "2" ...
# $ c : chr "" "" "" "" ...
# $ d : chr "" "" "" "" ...
# $ e : chr "" "" "" "" ...
# $ f : chr "" "" "" "" ...
# $ g : chr "" "" "" "" ...
# $ h : chr "" "" "" "" ...


head(df,n=9)

ID b c d e f g h
# 1
# 2 30 2233
# 3 26 12
# 4 20 2
# 5 30 22
# 6 40 13 800
# 7 5 23
# 8 10 23 EA
# 9 4 100


I have tried different combinations and suggestions found on SO, however nothing worked. The result is always somehow displaced instead of long its wide. In the current example ist just one long row.

I tried:

write.table(df,"df.csv",row.names = FALSE, dec=".",sep=";")
write.table(df,"df.csv",row.names = FALSE,dec=".",sep=";", col.names = T)
write.table(df,"df.csv",row.names = FALSE,sep=";",fileEncoding = "UTF-8")
write.table(df,"df.csv",row.names = FALSE,fileEncoding = "UTF-8")

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS

locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.UTF-8
[4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] dplyr_0.4.3 DBI_0.4-1 RGA_0.4.2 RMySQL_0.11-3

loaded via a namespace (and not attached):
[1] Rcpp_0.12.5 lubridate_1.5.6 digest_0.6.9 assertthat_0.1 R6_2.1.2
[6] plyr_1.8.3 jsonlite_1.0 magrittr_1.5 httr_1.1.0 stringi_1.1.1
[11] curl_0.9.7 tools_3.3.1 stringr_1.0.0 parallel_3.3.1


Wrong output as pic:

enter image description here

Correct output results from the same data on :
enter image description here

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
[![enter image description here][2]][2]

Answer

The problem isn't R or Ubuntu it is notepad. Specifically, it expects "\r\n" for line breaks whereas most other text readers are happy with "\n" which is the default line break used by write.xxx.

If you add the parameter eol="\r\n" then you should be able to open in Notepad and see the expected line breaks.

For instance:

write.table(df,"df.csv",row.names = FALSE, dec=".",sep=";",eol="\r\n")
Comments