Zaire Zaire -4 years ago 109
R Question

Reading only one column with leading 0 and retaining it in a CSV file with over 300 columns

I have a large CSV file a preview of which is shown here:

ID,NUMBER,RLNUMBER,START_DATE,ID1,ID2,....................................................,ID305
1,0100000109,623,2012-01-01,TT,06,........................................................,ADD
2,200000109,515,2013-09-23,FF,009,........................................................,BCC
3,0600000109,611,2014-11-15,HH,90,..........................................................,DGG


As you can see, the column NUMBER has some values with a leading '0' and some values without one. The same is true of the column ID2.

My requirement is that I have to move the contents of this CSV file to another CSV file. The contents of the OUTPUT CSV file should look something like this:

ID,NUMBER,RLNUMBER,START_DATE,ID1,ID2,....................................................,ID305
1,0100000109,623,2012-01-01,TT,6,........................................................,ADD
2,200000109,515,2013-09-23,FF,9,........................................................,BCC
3,0600000109,611,2014-11-15,HH,90,..........................................................,DGG


Notice that the values of column NUMBER are retained in the output CSV file along with their leading '0', while all values in column ID2 have had their leading '0' stripped.

For this, I only need to read the column NUMBER, and only that column, as vector type 'character' into a data frame and then write the data frame to the output CSV file (I think).

I know that using

data_frame <- read.csv("filename", colClasses = c("integer", "character", "integer", ......))


I can specify vector types for each column while reading the input CSV file. But doing this for more than 300 columns is very difficult. So is there any other way to do this?

I'm very new to R scripting (just started today) and any help would be greatly appreciated.

Answer Source

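You could try the following, since as far as I understand you are only interested in the NUMBER column. A named colClasses vector sets the class for that column only; the remaining columns are still type-guessed as usual: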

data_frame <- read.csv("filename", colClasses=c("NUMBER" = "character"))
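To round this out, here is a minimal sketch of the full read-and-write round trip. It assumes a small stand-in for the real file (six columns instead of 305) written to a temporary path; the names input_file, output_file and result are made up for the illustration. Writing with row.names = FALSE and quote = FALSE is my assumption about the output format wanted, based on the sample output in the question.

```r
# Hypothetical temporary paths, used only for this illustration.
input_file  <- tempfile(fileext = ".csv")
output_file <- tempfile(fileext = ".csv")

# A small stand-in for the real 300+ column input file.
writeLines(c(
  "ID,NUMBER,RLNUMBER,START_DATE,ID1,ID2",
  "1,0100000109,623,2012-01-01,TT,06",
  "2,200000109,515,2013-09-23,FF,009",
  "3,0600000109,611,2014-11-15,HH,90"
), input_file)

# Force only NUMBER to character; every other column is type-guessed,
# so ID2 is read as a number and loses its leading zeros.
data_frame <- read.csv(input_file, colClasses = c(NUMBER = "character"))

# row.names = FALSE omits the row index column;
# quote = FALSE writes character values without surrounding quotes.
write.csv(data_frame, output_file, row.names = FALSE, quote = FALSE)

# Read the raw lines back to inspect what was written.
result <- readLines(output_file)
print(result)
```

Note that quote = FALSE is only safe if no field contains a comma or a quote character. If some of the 300+ columns hold free text, leaving quoting on is the more cautious choice.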