Florian Richard - 3 months ago
R Question

r - How to make this loop faster?

I am reading a .csv file named cleanequityreturns.csv. Its columns go from r1 to r299 and it has 4,166 rows. The following code writes each column to a new file, reads that file back in, computes the approximate entropy using the approx_entropy function, and prints the value. I know creating a new file for each column is very tedious, but I could not find another way to do it.

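equityreturn <- read.csv("cleanequityreturns.csv", header=T)
for(i in 1:299) {
  # Build the output file name and the column name for column i
  file2 = paste(i, "equityret.csv", sep="")
  file5 = paste("r", i, sep="")
  # Extract the column and write it to its own file
  file1 = subset(equityreturn, select=file5)
  write.table(file1, file2, sep="\t", row.names=FALSE, col.names=FALSE)
  file3 = paste("equity", i, sep="")  # immediately overwritten on the next line
  # Read the column back in and compute its approximate entropy
  file3 = matrix(scan(file = file2), nrow=4166, byrow=TRUE)
  print(approx_entropy(file3, edim = 4, r=0.441*sd(file3), elag = 1))
}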


My problem is the following: the code takes a long time to run. I tried it on 10 columns and it took about 20 minutes, which translates to about 10 hours for all 299 columns. Also, this code only prints each approximate entropy value, so I still have to copy and paste the values into Excel to use them.

How could I make this code run faster and write the output to a .csv file?

Answer

Simply use lapply(). A data frame is a list of columns, so lapply() applies the function to each column in turn, with no need to write and re-read a file per column:

equityreturn <- read.csv("cleanequityreturns.csv", header = TRUE)

# Returns a named list with one approximate entropy value per column (r1, r2, ...)
entropy_values <- lapply(equityreturn, function(col) {
  approx_entropy(col, edim = 4, r = 0.441 * sd(col), elag = 1)
})
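
The question also asks how to get the values into a .csv file instead of printing them. One possible way is sketched below. The helper name column_stats_to_csv, its arguments, and the output file name entropy_values.csv are made up for illustration; the commented usage also assumes that approx_entropy() is the function from the pracma package, which the question does not state. vapply() is used rather than lapply() so that the result is a plain named numeric vector, which is easy to turn into a two-column table.

```r
# Hypothetical helper (not part of any package): applies stat_fun to every
# column of df and writes one row per column (column name, value) to out_file.
column_stats_to_csv <- function(df, stat_fun, out_file) {
  # vapply() checks that stat_fun returns exactly one number per column
  values <- vapply(df, stat_fun, numeric(1))
  result <- data.frame(column = names(values), value = unname(values))
  write.csv(result, out_file, row.names = FALSE)
  invisible(result)
}

# Usage with the question's data. Assumption: approx_entropy() is pracma's.
# library(pracma)
# equityreturn <- read.csv("cleanequityreturns.csv", header = TRUE)
# column_stats_to_csv(
#   equityreturn,
#   function(col) approx_entropy(col, edim = 4, r = 0.441 * sd(col), elag = 1),
#   "entropy_values.csv"
# )
```

The resulting file has one row per column of the input, so it can be opened directly in Excel. This removes the file round trip and the copy-and-paste step; if the entropy computation itself dominates the run time, the total time may not drop by much.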