user88911 - 12 days ago
R Question

Unexpected error when manipulating a list of data.frames

I have a list of data.frames produced by a custom function, and I want to split each data.frame by its last column using a given threshold. I manipulated the two lists and combined them into a single table without trouble, but I get an error when I manipulate this combined table, and I can't figure out where the issue comes from. How can I fix this error? Once it is fixed, I want to wrap the logic in a function. Is there an easier way to manipulate a list of data.frames, or a better way to debug the error?

Minimal example:

savedDF <- list(
bar = data.frame(.start=c(12,21,37), .stop=c(14,29,45), .score=c(5,9,4)),
cat = data.frame(.start=c(18,42,18,42,81), .stop=c(27,46,27,46,114), .score=c(10,5,10,5,34)),
foo = data.frame(.start=c(3,3,33,3,33,91), .stop=c(24,24,10,24,10,17), .score=c(22,22,6,22,6,7))
)

discardedDF <- list(
bar = data.frame(.start=c(16), .stop=c(20), .score=c(2)),
cat = data.frame(.start=c(21), .stop=c(23), .score=c(1)),
foo = data.frame(.start=c(54), .stop=c(71), .score=c(3))
)


I manipulate the lists this way:

both <- do.call("rbind", c(savedDF, discardedDF))
cn <- c("letter", "seq")
# FIXME :
DF <- cbind(
read.table(text = chartr("_", ".", rownames(both)), header=T, sep = ".", col.names = cn),
both)
DF <- transform(DF, isPassed = ifelse(.score > 8, "Pass", "Fail"))

by(DF, DF[c("letter", "isPassed")],
function(x) write.csv(x[-(1:length(savedDF))],
sprintf("%s_%s_%s.csv", x$letter[1], x$isPassed[1])))


But I get this error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 15 did not have 2 elements


Why do I get this error? Can anyone point out how to fix it?

My desired output is a list of CSV files as follows:

bar.saved.Pass.csv
bar.saved.Fail.csv
bar.discarded.Pass.csv
bar.discarded.Fail.csv

cat.saved.Pass.csv
cat.saved.Fail.csv
cat.discarded.Pass.csv
cat.discarded.Fail.csv

foo.saved.Pass.csv
foo.saved.Fail.csv
foo.discarded.Pass.csv
foo.discarded.Fail.csv


I also don't yet have good control over where the CSV files are exported. How can I improve this wrapper? I would like to let the user choose the output directory, or make it more dynamic in some other way. Any ideas? Thanks a lot.

Answer

Is this what you are looking for?

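library(tidyverse)
library(magrittr)  # provides the %<>% assignment pipe

# Stack both lists; rbind() names rows "bar.1", "bar.2", ... for multi-row frames
# and just "bar" for single-row frames
both <- do.call("rbind", c(savedDF, discardedDF))
both %<>% rownames_to_column(var = "cn")
# Row names without a "." get seq = NA; fill = "right" makes that explicit and silences the warning
both %<>% separate(cn, c("letters", "seq"), sep = "\\.", fill = "right")
# seq is NA only for the single-row frames, which are the discarded ones in this example
both %<>% mutate(isPassed = ifelse(.score > 8, "Passed", "Failed"),
                 isDiscard = ifelse(is.na(seq), "Discarded", "Saved"))

list_of_dfs <- both %>% split(list(.$letters, .$isPassed, .$isDiscard))
csv_names <- paste0("/Users/nathanday/Desktop/", names(list_of_dfs), ".csv") # change this path
mapply(write.csv, list_of_dfs, csv_names)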

The %<>% operator from magrittr is shorthand for piping and assigning the result back, so both %<>% rownames_to_column(var = "cn") is identical to both <- rownames_to_column(both, var = "cn").
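
As for the original error, it most likely comes from the row names that rbind builds: a named data.frame with several rows gets names like bar.1, bar.2, but a single-row one is named just bar, so there is no "." for read.table(sep = ".") to split on. header = TRUE also looks suspect, since it consumes the first row name as a header line. Here is a minimal base R sketch on cut-down data. The names saved, discarded and parsed are made up for illustration, and fill = TRUE is only one possible workaround:

```r
# Cut-down stand-ins for savedDF / discardedDF (illustrative data only)
saved <- list(
  bar = data.frame(.score = c(5, 9, 4)),
  cat = data.frame(.score = c(10, 5))
)
discarded <- list(
  bar = data.frame(.score = 2),
  cat = data.frame(.score = 1)
)

both <- do.call("rbind", c(saved, discarded))
rn <- rownames(both)
print(rn)

# Row names with no "." cannot be split into two fields by sep = "."
no_dot <- !grepl(".", rn, fixed = TRUE)
print(rn[no_dot])  # "bar" "cat"

# One possible workaround: no header line, and pad short lines with NA
parsed <- read.table(text = rn, sep = ".", header = FALSE, fill = TRUE,
                     col.names = c("letter", "seq"))
print(parsed)
```

With fill = TRUE the dotless names parse with seq = NA, which is the same signal the is.na(seq) test in the code above keys on.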

To make it more "dynamic" and accept an output path, you could wrap the tidyverse steps in a function like this:

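output_where <- function(output_path, list1, list2) {
    # Create the output directory (and any missing parents) if needed
    if (!dir.exists(output_path)) {
        dir.create(output_path, recursive = TRUE)
    }
    both <- do.call(rbind, c(list1, list2))
    both %<>% rownames_to_column(var = "cn")
    both %<>% separate(cn, c("letters", "seq"), sep = "\\.", fill = "right")
    # seq is NA only for single-row frames, i.e. the discarded ones in the example
    both %<>% mutate(isPassed = ifelse(.score > 8, "Passed", "Failed"),
                     isDiscard = ifelse(is.na(seq), "Discarded", "Saved"))

    list_of_dfs <- both %>% split(list(.$letters, .$isPassed, .$isDiscard))
    # file.path() adds the separator, so output_path works with or without a trailing slash
    csv_names <- file.path(output_path, paste0(names(list_of_dfs), ".csv"))
    # write.csv() returns NULL, so hide the resulting list of NULLs
    invisible(mapply(write.csv, list_of_dfs, csv_names))
}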

output_where("~/Desktop/", savedDF, discardedDF)
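
If you would rather not depend on row names at all, a variation is to tag each data.frame with its list name and its origin before binding. Below is a base R sketch under a few assumptions: tag_frames and write_groups are hypothetical helper names, the threshold of 8 is taken from the question, and the files go to a temporary directory so the example can be run anywhere. File names follow the letter.origin.Pass/Fail pattern from the question, but combinations with no rows are skipped (drop = TRUE), so fewer than twelve files are written for the example data.

```r
# Example data copied from the question
savedDF <- list(
  bar = data.frame(.start = c(12, 21, 37), .stop = c(14, 29, 45), .score = c(5, 9, 4)),
  cat = data.frame(.start = c(18, 42, 18, 42, 81), .stop = c(27, 46, 27, 46, 114),
                   .score = c(10, 5, 10, 5, 34)),
  foo = data.frame(.start = c(3, 3, 33, 3, 33, 91), .stop = c(24, 24, 10, 24, 10, 17),
                   .score = c(22, 22, 6, 22, 6, 7))
)
discardedDF <- list(
  bar = data.frame(.start = 16, .stop = 20, .score = 2),
  cat = data.frame(.start = 21, .stop = 23, .score = 1),
  foo = data.frame(.start = 54, .stop = 71, .score = 3)
)

# Hypothetical helper: prepend `letter` and `origin` columns to every data.frame
tag_frames <- function(dfs, origin) {
  Map(function(df, nm) {
    data.frame(letter = rep(nm, nrow(df)), origin = rep(origin, nrow(df)), df,
               stringsAsFactors = FALSE)
  }, dfs, names(dfs))
}

# Hypothetical wrapper: bind, flag Pass/Fail, split, and write one CSV per group
write_groups <- function(saved, discarded, out_dir, threshold = 8) {
  dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
  tagged <- c(tag_frames(saved, "saved"), tag_frames(discarded, "discarded"))
  # unname() keeps rbind() from building "bar.1"-style row names
  both <- do.call(rbind, unname(tagged))
  both$isPassed <- ifelse(both$.score > threshold, "Pass", "Fail")
  # drop = TRUE skips letter/origin/isPassed combinations that have no rows
  groups <- split(both, list(both$letter, both$origin, both$isPassed), drop = TRUE)
  paths <- file.path(out_dir, paste0(names(groups), ".csv"))
  Map(function(df, path) write.csv(df, path, row.names = FALSE), groups, paths)
  invisible(paths)
}

out_dir <- file.path(tempdir(), "split_csv")
written <- write_groups(savedDF, discardedDF, out_dir)
print(basename(written))
```

Because the letter and origin are ordinary columns, this keeps working when a discarded data.frame has more than one row, which the row-name approach does not.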