user88911 - 23 days ago
R Question

How to export filtered data.frame into desired folder dynamically?

I've read a list of csv files from a folder, and now I want to filter each one by a given threshold: the rows dropped from each data.frame must be exported to a chosen folder dynamically, while the rows that are kept are returned as output. I implemented a function for this task, and it works fine except that writing the dropped rows as csv files into the chosen folder fails. Can anyone point out what is going wrong in my function? Is there an efficient way to write a data.frame to a specific folder dynamically? How can I correct my implementation?

reproducible data:

myData <- list(
df_1 = data.frame( L1=seq(3, by=4, len=16), L2=seq(7, by=4, len=16), score=sample(30, 16)),
df_2 = data.frame( L1=seq(6, by=7, len=20), L2=seq(14, by=7, len=20), score=sample(30, 20)),
df_3 = data.frame( L1=seq(11, by=8, len=25), L2=seq(19, by=8, len=25), score=sample(30, 25))
)


I implemented this function; it works fine except that the csv files are not written as desired:

func <- function(mlist, threshold=NULL, outDir=getwd(), .fileName=NULL, ...) {
  if(!dir.exists(outDir)) {
    dir.create(file.path(outDir))
    setwd(file.path(outDir))
  }
  rslt <- lapply(mlist, function(x) {
    .drop <- x[x$score < threshold,]
    # FIXME : write dropped rows of each data.frame into specific folder
    write.csv(.drop, sprintf("drop.%s.csv", x), row.names = FALSE)
    .save <- x[x$score >= threshold,]
    return(.save)
  })
  return(rslt)
}


This is what I intend: write the csv files to a specific location built from .initPath = getwd(), that is, create a new folder there and write the csv files into it. I don't understand what went wrong in my implementation; I get an error.

How can I write the dropped rows from each data.frame into a specific folder dynamically? Is there a quick way to do this more efficiently? Thanks a lot.

Answer

Currently, in your write.csv() line, you are concatenating the dataframe object itself, x, into the file name with sprintf(). You need to concatenate the name of the dataframe object into the file name instead.
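As a minimal sketch of that difference, using a small made-up list (demoList is hypothetical and not part of the question's data): names() on the list gives the element names that belong in the file name, while names() on a single dataframe gives its column names.

```r
# Hypothetical two-element list, for illustration only
demoList <- list(
  df_a = data.frame(L1 = 1:3, score = c(5, 12, 20)),
  df_b = data.frame(L1 = 4:6, score = c(8, 15, 30))
)

# Names of the list elements: these are what the file names should use
listNames <- names(demoList)

# Names of one dataframe: these are its column names, not "df_a"
colNames <- names(demoList[[1]])
print(colNames)   # "L1" "score"

# One file name per list element
fileNames <- sprintf("drop.%s.csv", listNames)
print(fileNames)  # "drop.df_a.csv" "drop.df_b.csv"
```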

So, consider replacing your lapply() with Map() (a wrapper for mapply(func, x, y, SIMPLIFY=FALSE)), passing two arguments: mlist itself and the names of mlist. Note that you might think using names(x) in the original setup would work, but inside the lapply() call x is a single dataframe, so names(x) returns its column names, which still fail to produce a sensible file name.

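func <- function(mlist, threshold=NULL, outDir=getwd(), .fileName=NULL, ...) {
  # create the output folder if needed; do not change the working directory
  if(!dir.exists(outDir)) {
    dir.create(outDir, recursive = TRUE)
  }
  rslt <- Map(function(x, y) {
    # x is the dataframe, y is its name in mlist
    .drop <- x[x$score < threshold,]
    # write dropped rows into outDir, using the dataframe name in the file name
    write.csv(.drop, file.path(outDir, sprintf("drop.%s.csv", y)), row.names = FALSE)
    .save <- x[x$score >= threshold,]
    return(.save)
  }, mlist, names(mlist))
  return(rslt)
}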

# EXAMPLE
newData <- func(myData, threshold=10)
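To see how Map() walks the two inputs in parallel, here is a small sketch with a hypothetical list demoList; it only builds labels rather than writing files. Map() names its result after the first argument when that argument has names, so the output keeps the names of the list.

```r
# Hypothetical list, for illustration only
demoList <- list(
  df_a = data.frame(score = c(5, 12)),
  df_b = data.frame(score = c(8, 15, 30))
)

# x is the dataframe, y is its name; both are passed element by element
labels <- Map(function(x, y) {
  sprintf("%s has %d rows", y, nrow(x))
}, demoList, names(demoList))

print(labels$df_a)    # "df_a has 2 rows"
print(names(labels))  # "df_a" "df_b"
```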

And if you do want to keep lapply(), iterate over the indices of mlist and create temporary variables to capture the df object and the df name. The version below also shows how to allow dynamic path and file name changes by passing such values as arguments and concatenating them all with sprintf():

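func <- function(mlist, threshold=NULL, csvName="", outDir=getwd(), .fileName=NULL, ...) {
  # create the output folder if needed; the full path is built below, so no setwd()
  if(!dir.exists(outDir)) {
    dir.create(outDir, recursive = TRUE)
  }
  rslt <- lapply(seq_along(mlist), function(x) {
    df <- mlist[[x]]; dfname <- names(mlist)[x]
    .drop <- df[df$score < threshold,]
    # file name is built from the folder, the prefix and the dataframe name
    write.csv(.drop, sprintf("%s/%s.%s.csv", outDir, csvName, dfname), row.names = FALSE)
    .save <- df[df$score >= threshold,]
    return(.save)
  })
  # lapply() over indices returns an unnamed list, so restore the names
  names(rslt) <- names(mlist)
  return(rslt)
}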

# EXAMPLE (usercsv and userpath stand for your own file prefix and folder path)
newData <- func(myData, threshold=10, csvName=usercsv, outDir=userpath)
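For a quick check that the files end up where expected, the following self-contained sketch may help. It is an illustration under stated assumptions rather than the answer's exact function: splitByScore, prefix, demoList and demoDir are hypothetical names, the data is made up, and the output goes to a temporary folder. It builds the path with file.path() instead of pasting "/" by hand.

```r
# Hypothetical helper: split each dataframe at a score threshold,
# write the dropped rows to outDir, and return the kept rows.
splitByScore <- function(mlist, threshold, outDir, prefix = "drop") {
  # Create the folder (and any parents) if it is missing; no setwd() needed
  if (!dir.exists(outDir)) {
    dir.create(outDir, recursive = TRUE)
  }
  kept <- lapply(names(mlist), function(nm) {
    df <- mlist[[nm]]
    isKept <- df$score >= threshold
    # file.path() joins folder and file name with the platform separator
    outFile <- file.path(outDir, sprintf("%s.%s.csv", prefix, nm))
    write.csv(df[!isKept, , drop = FALSE], outFile, row.names = FALSE)
    df[isKept, , drop = FALSE]
  })
  # lapply() over the names returns an unnamed list, so restore the names
  setNames(kept, names(mlist))
}

# Example with made-up data, written to a temporary folder
demoList <- list(
  df_a = data.frame(L1 = 1:4, score = c(5, 12, 20, 3)),
  df_b = data.frame(L1 = 5:7, score = c(8, 15, 30))
)
demoDir <- file.path(tempdir(), "dropped")
keptRows <- splitByScore(demoList, threshold = 10, outDir = demoDir)

print(list.files(demoDir))  # one csv file per dataframe
print(nrow(keptRows$df_a))  # 2
```

Passing a different outDir or prefix changes the destination and file names without touching the working directory.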