Masi Masi - 1 month ago 5
R Question

Why does this R CSV column selection fail?

I can summarize a CSV file that has a single column, but not a single column taken from a multicolumn file.

dataA = read.csv("data.csv", header = FALSE,sep = ",")
summary(dataA) # works!


Output: the correct basic statistical summary of the values (Min., 1st Qu., ...).
Now I have multicolumn data and want to use only the second column. The data looks like this:

ID,Age,Gender
1,2,3
4,5,6


Code, where dataA[-(1), 2] is meant to remove the header row and take the second column:

dataA = read.csv("data.csv", header = FALSE,sep = ",")
dataA = dataA[-(1), 2]
summary(dataA) # does not work!:


Output: a list of values with counts and no statistical summary; the column seems to be treated as strings or something similar. Here is an example from a bigger data set:

male 5 27.78
23 24 32 39 43 47 51 53 54 56 57 59 61 62 63 64 65 66 68
2 2 2 2 1 1 1 2 1 1 1 1 1 1 1 2 2 1 1 2
69 72 73 75 76 77 80 81 83 84 87 89 Age
3 2 2 1 1 1 1 1 2 2 2 1 0


Expected output, like this:

V1
Min. :23.00
1st Qu.:50.75
Median :65.00
Mean :58.33
3rd Qu.:68.75
Max. :81.00


OS: Debian 8.5

R: 3.1.1

Answer

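Your multicolumn file has a header row. With header = FALSE, R reads that row as data, so the second column contains the text "Age" alongside the numbers and is stored as a factor (the default for text columns in R 3.1.1). Dropping the first row afterwards does not change the column's type, and summary() on a factor prints a count per level instead of quartiles. Read the header as a header instead: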

dataA = read.csv("data.csv", header = TRUE, sep = ",")
dataA = dataA[, 2]
summary(dataA)
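
To see the difference without touching a file, here is a minimal sketch that feeds the three sample lines from the question to read.csv through its text argument. The variable names (csv_text, age_bad, age_ok, age_fixed) are made up for illustration, and stringsAsFactors = TRUE is set explicitly on the assumption that you want to mimic the R 3.1.1 default; newer R versions would give a character column instead of a factor.

```r
# Inline copy of the sample data from the question (illustrative only).
csv_text <- "ID,Age,Gender\n1,2,3\n4,5,6"

# header = FALSE: the header row is read as data, so column 2 holds
# "Age", "2", "5" and is stored as a factor.
raw <- read.csv(text = csv_text, header = FALSE, sep = ",",
                stringsAsFactors = TRUE)
age_bad <- raw[-1, 2]        # dropping row 1 does not change the type
print(class(age_bad))        # "factor"
print(summary(age_bad))      # counts per level, including an unused "Age" level

# header = TRUE: the first row becomes the column names and column 2
# is parsed as numbers.
ok <- read.csv(text = csv_text, header = TRUE, sep = ",")
age_ok <- ok[, 2]            # same column as ok$Age
print(class(age_ok))         # "integer"
print(summary(age_ok))       # Min., 1st Qu., Median, Mean, ...

# If the file really must be read with header = FALSE, convert the
# factor through character; as.numeric() on the factor itself would
# return the internal level codes, not the values.
age_fixed <- as.numeric(as.character(age_bad))
print(summary(age_fixed))
```

Running class() or str() on the column is a quick way to check what type summary() is actually being given.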