user3127034 - 9 months ago
R Question

Return a data frame from function

I have the following code inside a function

Myfunc<- function(directory, MyFiles, id = 1:332) {
# uncomment the 3 lines below for testing
#id=c(2, 4)

df2 <- data.frame()

for(i in 1:length(idd)) {
EmptyVector <- read.csv(MyFiles[i])

This works when I run it in R by selecting the code inside the function and commenting out the return. The print statement gives me a nice data frame:

> df2
id ret2
1 2 994
2 4 7112

However, when I try to return the data frame from the function, it only returns the first row and ignores all other values. It works inside the function for the various values I have tried (opening multiple files in various combinations), but not when I return the data frame. Can someone help, please? Thanks a lot in advance.

Answer Source

If I understand you correctly, you are trying to create a data frame with the number of complete cases for each id. Supposing your files are named with the id numbers as you specified (e.g. f2.csv), you can simplify your function as follows:

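myfunc <- function(directory, id = 1:332) {
  y <- vector()
  for(i in 1:length(id)){
    x <- id
    # read the file for this id and count its complete cases
    y <- c(y, sum(complete.cases(read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))))))
  }
  df <- data.frame(x, y)
  colnames(df) <- c("id","ret2")
  return(df)
}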

You can call this function like this:
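As a self-contained sketch of such a call: the temporary directory, the two sample files f2.csv and f4.csv, and their contents are made up for illustration. Only the file-naming scheme (f followed by the id) and the logic of the function follow this answer.

```r
# Function along the lines described in this answer:
# counts the complete cases in each file f<id>.csv
myfunc <- function(directory, id = 1:332) {
  y <- vector()
  for (i in seq_along(id)) {
    path <- paste0(directory, "/", "f", id[i], ".csv")
    y <- c(y, sum(complete.cases(read.csv(path))))
  }
  df <- data.frame(id, y)
  colnames(df) <- c("id", "ret2")
  df
}

# Hypothetical sample data written to a temporary directory
demo_dir <- tempfile("demo")
dir.create(demo_dir)
write.csv(data.frame(a = c(1, NA, 3), b = c(4, 5, 6)),
          file.path(demo_dir, "f2.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2, 3), b = c(NA, NA, 6)),
          file.path(demo_dir, "f4.csv"), row.names = FALSE)

# The call itself: directory first, then the ids of interest
result <- myfunc(demo_dir, id = c(2, 4))
print(result)  # id 2 has 2 complete rows, id 4 has 1
```

With real data you would pass the path of the folder that holds your csv files instead of demo_dir.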


An explanation of the above code. You have to break down your problem into steps:

  1. You need a vector of the ids; that's done by x <- id
  2. For each id you want the number of complete cases. In order to get that, you have to read the file first. That's done by read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))). To get the number of complete cases for that file, you have to wrap the read.csv code inside sum and complete.cases.
  3. Now you want to add that number to a vector. Therefore you need an empty vector (y <- vector()) to which you can add the number of complete cases from step 2. That's done by wrapping the code from step 2 inside y <- c(y, "code step 2"). With this you add the number of complete cases for each id to the vector y.
  4. The final step is to combine these two vectors into a dataframe with df <- data.frame(x, y) and assign some meaningful colnames.
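
Steps 2 and 3 can be tried out without any files. The sketch below uses a small made-up data frame (toy) in place of the result of read.csv, so the values are illustrative only.

```r
# Made-up stand-in for the contents of one file
toy <- data.frame(a = c(1, NA, 3, 4), b = c("x", "y", NA, "z"))

# Step 2: complete.cases() flags the rows that contain no NA
# (here rows 1 and 4)
flags <- complete.cases(toy)
print(flags)

# sum() counts the TRUE values, i.e. the number of complete rows
n_complete <- sum(flags)

# Step 3: append that count to an initially empty vector
y <- vector()
y <- c(y, n_complete)
print(y)
```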

By including the steps 1, 2 and 3 (except the y <- vector() part) in a for-loop, you can iterate over the list of specified id's. Creating the empty vector with y <- vector() has to be done before the for-loop, so that the for-loop can add values to y.
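
Growing y inside the loop works, but the same result can also be sketched without an explicit loop by using vapply. This is an alternative formulation, not the code from the answer above; the function name count_complete and the sample files are hypothetical.

```r
# Alternative sketch: vapply instead of growing a vector in a for-loop
count_complete <- function(directory, id = 1:332) {
  counts <- vapply(id, function(one_id) {
    path <- file.path(directory, paste0("f", one_id, ".csv"))
    sum(complete.cases(read.csv(path)))
  }, integer(1))
  data.frame(id = id, ret2 = counts)
}

# Hypothetical sample files in a temporary directory
demo_dir <- tempfile("demo")
dir.create(demo_dir)
write.csv(data.frame(a = c(1, NA), b = c(2, 3)),
          file.path(demo_dir, "f2.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2), b = c(2, 3)),
          file.path(demo_dir, "f4.csv"), row.names = FALSE)

out <- count_complete(demo_dir, id = c(2, 4))
print(out)  # id 2 has 1 complete row, id 4 has 2
```

Because vapply collects one count per id, no empty vector has to be created beforehand.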