Donkeykongy - 3 months ago
R Question

Repeating the same command for x number of times

I am trying to repeat the same command x times. A simple example is reading ten files whose names differ only by year. I can do this:

yr2001detail<-read.csv("E:/yr2001detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2002detail<-read.csv("E:/yr2002detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2003detail<-read.csv("E:/yr2003detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2004detail<-read.csv("E:/yr2004detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2005detail<-read.csv("E:/yr2005detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2006detail<-read.csv("E:/yr2006detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2007detail<-read.csv("E:/yr2007detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2008detail<-read.csv("E:/yr2008detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2009detail<-read.csv("E:/yr2009detail.csv",stringsAsFactors = FALSE,header=TRUE )
yr2010detail<-read.csv("E:/yr2010detail.csv",stringsAsFactors = FALSE,header=TRUE )


This is bad because I am repeating myself, and it becomes really time-consuming when there are many files or many repetitions. I have tried using a loop:

for(i in 1:10){
paste("yr",2000+i,"detail",sep="")<-read.csv(paste("E:/yr",2000+i,"detail.csv",sep=""),stringsAsFactors = FALSE,header=TRUE )
}


which didn't work because of the left-hand side of the assignment. I also tried this, which didn't work either:

vector <- rep(NA,10)
for(i in 1:10){
vector[i] <- paste("yr",2000+i,"detail",sep="")
}
for(i in 1:10){
vector[i]<-read.csv(paste("E:/yr",2000+i,"detail.csv",sep=""),stringsAsFactors = FALSE,header=TRUE )
}


I am asking because, further along, I'll have to process my data year by year, which would mean writing even more repetitive commands for each year.

Answer

We can use sprintf to create the file paths ('files') and the object names ('filenames'):

files <- sprintf("E:/yr%ddetail.csv", 2001:2010)
filenames <- sprintf("yr%ddetail", 2001:2010)

Or paste0 can be used instead:

files <- paste0("E:/yr", 2001:2010, "detail.csv")
filenames <- paste0("yr", 2001:2010, "detail")
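
As a quick check, both approaches should build the same character vectors when the "yr" prefix is included in the path. A minimal sketch (the names `years`, `files_sprintf` and `files_paste` are only for illustration):

```r
years <- 2001:2010

# Build the file paths both ways
files_sprintf <- sprintf("E:/yr%ddetail.csv", years)
files_paste   <- paste0("E:/yr", years, "detail.csv")

# One element per year, and the two vectors match exactly
length(files_sprintf)
identical(files_sprintf, files_paste)
head(files_sprintf, 2)
```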

and then loop through the files to read them. If we need separate objects, use assign:

for(j in seq_along(filenames)){
    assign(filenames[j], read.csv(files[j], stringsAsFactors=FALSE, header=TRUE))
}
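
To try the assign approach without the real files, here is a small sketch that writes three tiny CSV files to a temporary folder first. The file contents, the `tmp_dir` and `env` names, and the use of a separate environment are assumptions for illustration; the loop above assigns into the current environment instead.

```r
# Hypothetical setup: write three tiny CSV files to a temporary folder
tmp_dir <- tempdir()
years <- 2001:2003
files <- file.path(tmp_dir, sprintf("yr%ddetail.csv", years))
filenames <- sprintf("yr%ddetail", years)
for (j in seq_along(files)) {
    write.csv(data.frame(year = years[j], value = j), files[j], row.names = FALSE)
}

# Create one object per file; 'env' keeps them out of the global environment
env <- new.env()
for (j in seq_along(filenames)) {
    assign(filenames[j],
           read.csv(files[j], stringsAsFactors = FALSE, header = TRUE),
           envir = env)
}

ls(env)                           # names of the objects that were created
get("yr2002detail", envir = env)  # retrieve one object by its name
```

Objects created this way have to be looked up by name with get (or mget) whenever the name is itself built in a loop, which is one reason the list approach tends to be easier to work with.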

However, it is better to read them into a list rather than having many objects in the global environment, i.e.

lst <- setNames(lapply(files, read.csv, stringsAsFactors=FALSE, header=TRUE), filenames)
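
A named list also helps with the per-year processing mentioned in the question, because one call can be applied to every element. A minimal sketch, again using hypothetical temporary files and a made-up `value` column:

```r
# Hypothetical setup: three tiny CSV files in a temporary folder
tmp_dir <- tempdir()
years <- 2001:2003
files <- file.path(tmp_dir, sprintf("yr%ddetail.csv", years))
filenames <- sprintf("yr%ddetail", years)
for (j in seq_along(files)) {
    write.csv(data.frame(year = years[j], value = c(j, j * 10)),
              files[j], row.names = FALSE)
}

lst <- setNames(lapply(files, read.csv, stringsAsFactors = FALSE, header = TRUE),
                filenames)

# One year's data, looked up by name
lst[["yr2002detail"]]

# The same operation applied to every year in one call
row_counts <- sapply(lst, nrow)
totals <- sapply(lst, function(d) sum(d$value))
```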

Or a faster option with fread

library(data.table)
lst <- setNames(lapply(files, fread), filenames)

After reading the files into a list, we can also rbind the datasets into a single one, with an 'id' column indicating which file each row came from. This can be useful in several operations.

dt <- rbindlist(lst, idcol="Grp")
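
If data.table is not available, roughly the same result can be sketched in base R. This is an alternative technique, not the rbindlist call above; the list contents and the `value` column are hypothetical stand-ins for the data read from the files.

```r
# Hypothetical stand-in for the named list of data frames read from the files
lst <- list(
    yr2001detail = data.frame(value = c(1, 2)),
    yr2002detail = data.frame(value = c(3, 4, 5))
)

# Tag each data frame with its list name, then stack them row-wise
tagged <- Map(function(d, nm) data.frame(Grp = nm, d, stringsAsFactors = FALSE),
              lst, names(lst))
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

# Count or summarise rows by the file they came from
table(combined$Grp)
aggregate(value ~ Grp, data = combined, FUN = sum)
```

The 'Grp' column plays the same role as the id column created by idcol: it records which list element each row came from.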