Prradep - 2 months ago
R Question

Error when reading a csv file

I have been facing an error while reading a csv file. The first few lines of the file are given below:

"","1.CEL","2.CEL","3.CEL","4.CEL"
"1_s_at",NA,NA,NA,NA
"2_at",NA,NA,NA,NA
"3_at",NA,NA,NA,NA
"4_at",NA,NA,NA,NA
"5_g_at",NA,NA,NA,NA
"6_at",NA,NA,NA,NA
"7_at",NA,NA,NA,NA


Reading the csv file:

test <- read.csv(file='/home/userxyz/test.csv')
head(test)
# X X1.CEL X2.CEL X3.CEL X4.CEL
#1 1_s_at NA NA NA NA
#2 2_at NA NA NA NA
#3 3_at NA NA NA NA
#4 4_at NA NA NA NA
#5 5_g_at NA NA NA NA
#6 6_at NA NA NA NA


Explicitly specifying the presence of the header gives the same result:

test <- read.csv(file='/home/userxyz/test.file', header=T)
head(test)
# X X1.CEL X2.CEL X3.CEL X4.CEL
#1 1_s_at NA NA NA NA
#2 2_at NA NA NA NA
#3 3_at NA NA NA NA
#4 4_at NA NA NA NA
#5 5_g_at NA NA NA NA
#6 6_at NA NA NA NA


Explicitly specifying row.names did not work:

test <- read.csv(file='/home/userxyz/test.file', row.names=T)
#Error in read.table(file = file, header = header, sep = sep, quote = quote, :
# invalid 'row.names' specification


I have also looked at the read.table and read.delim functions.

Is the error because of special characters in the row.names?

Answer

I think you are trying to read in the first column as the row names. Try:

x <- '"","1.CEL","2.CEL","3.CEL","4.CEL"
"1_s_at",NA,NA,NA,NA
"2_at",NA,NA,NA,NA
"3_at",NA,NA,NA,NA
"4_at",NA,NA,NA,NA
"5_g_at",NA,NA,NA,NA
"6_at",NA,NA,NA,NA
"7_at",NA,NA,NA,NA'

read.csv(text = x, row.names = 1L)

#       X1.CEL X2.CEL X3.CEL X4.CEL
#1_s_at     NA     NA     NA     NA
#2_at       NA     NA     NA     NA
#3_at       NA     NA     NA     NA
#4_at       NA     NA     NA     NA
#5_g_at     NA     NA     NA     NA
#6_at       NA     NA     NA     NA
#7_at       NA     NA     NA     NA

If you want to preserve the header exactly, add check.names = FALSE:

read.csv(text = x, row.names = 1L, check.names = FALSE)

#       1.CEL 2.CEL 3.CEL 4.CEL
#1_s_at    NA    NA    NA    NA
#2_at      NA    NA    NA    NA
#3_at      NA    NA    NA    NA
#4_at      NA    NA    NA    NA
#5_g_at    NA    NA    NA    NA
#6_at      NA    NA    NA    NA
#7_at      NA    NA    NA    NA
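
The X prefix in the first result comes from the default check.names = TRUE, which, as far as I know, passes the header through make.names() so that every column name is syntactically valid. A minimal sketch of that step (the vector `original` is just a hand-typed copy of the header, not something read.csv exposes):

```r
# Header fields as they appear in the file (hand-copied for illustration)
original <- c("", "1.CEL", "2.CEL", "3.CEL", "4.CEL")

# Roughly what check.names = TRUE does to the header: names that are
# empty or start with a digit get an "X" prepended
checked <- make.names(original, unique = TRUE)
print(checked)
```

With check.names = FALSE the names stay as 1.CEL, 2.CEL and so on, so a column then has to be accessed with backticks or [[ ]], for example df[["1.CEL"]].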

Regarding row.names, see ?read.csv:

row.names: a vector of row names.  This can be a vector giving the
           actual row names, or a single number giving the column of the
           table which contains the row names, or character string
           giving the name of the table column containing the row names.
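
To illustrate the three forms described there, and why row.names=T fails, here is a short sketch on a cut-down copy of the data. The names csv_text, by_index, by_name, by_vector and err are made up for this example, and the replacement row names "a" and "b" are arbitrary.

```r
# Cut-down copy of the file contents (two columns, two rows)
csv_text <- '"","1.CEL","2.CEL"
"1_s_at",NA,NA
"2_at",NA,NA'

# 1. A single number: the column that contains the row names
by_index <- read.csv(text = csv_text, row.names = 1L)

# 2. A character string: the name of that column. The empty header
#    field is read as "X" (see the question's output), so use "X" here
by_name <- read.csv(text = csv_text, row.names = "X")

# 3. A vector giving the actual row names, one per data row.
#    The first column is then kept as an ordinary column
by_vector <- read.csv(text = csv_text, row.names = c("a", "b"))

# A logical value such as TRUE (or T) matches none of these forms,
# so read.table stops with an error; capture the message instead
err <- tryCatch(
  read.csv(text = csv_text, row.names = TRUE),
  error = function(e) conditionMessage(e)
)
print(err)
```

So the error in the question is about the type of the row.names argument, not about special characters in the row names themselves.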