Titan552 - 2 months ago
R Question

Write a function that returns a vector or list of three statistics

This is a question for school, but I have been working on it for hours and just need a point in the right direction. I am not asking for the full answer.

I was given a data frame with student grades for various assessments. I have to write a function that will result in a vector or list that will give the min, max, and average of one particular assessment.

I was provided with the following framework:

checkAssessment <- function(df, assessmentName)
{

}


I need to be able to write the code to get the exact results below when the following line of code is executed:

checkAssessment(df,"hw1")
# $min
# [1] 0
#
# $max
# [1] 14
#
# $avg
# [1] 12.58824


So, I have tried many ways to go about this, none of which have worked. The two that came closest were

checkAssessment <- function(df, assessmentName)
{
my_min <- df$assessmentName == min(assessmentName)
my_max <- df$assessmentName == max(assessmentName)
my_avg <- df$assessmentName == mean(assessmentName)
return(df[my_min, ])
return(df[my_max, ])
return(df[my_avg, ])
}


and

checkAssessment <- function(df, assessmentName)
{
my_min <- sapply(df$assessmentName, min)
my_max <- sapply(df$assessmentName, max)
my_avg <- sapply(mean.default(df$assessmentName, trim = 0, na.rm = FALSE,
...))
funs = c(min, max, mean)
return(df[my_min, ])
return(df[my_max, ])
return(df[my_avg, ])
}


I'm not even sure if I'm close with either of these. I'm in an introductory R course so the code should be fairly simple, but I've developed a mental block with this question.

Any help would be very much appreciated. Thank you.

Answer

Because you were given the function framework, we have to use it.

checkAssessment <- function(df, assessmentName)
{
x <- df[[assessmentName]]  ## extract column vector
return(list(min = min(x), max = max(x), avg = mean(x)))  ## use a list for multiple return
}

Note:

  1. To extract a column from a data frame by a name stored in a variable, use [[ ]], which matches the name exactly. $ does not work here: df$assessmentName looks for a column literally named assessmentName instead of using the variable's value, and $ also does partial matching.
  2. R already provides the base functions min, max and mean, so there is no need to struggle with x[x == min(x)] and the like. If you do want the element itself, you can try x[which.min(x)]. Read ?which.min for more.
  3. If you want to return multiple values, collect them in a list. The basic way to set up a list is list(1, 2), but list elements can also have names, so compare with list(a = 1, b = 2).
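
The three notes above can be tried out on a tiny data frame. This is only a sketch: the data frame `grades` and its columns `hw1` and `quiz1` are invented for illustration and are not the data from the question.

```r
# Hypothetical data, made up for this illustration
grades <- data.frame(hw1 = c(0, 14, 12, 0), quiz1 = c(5, 3, 4, 5))
assessmentName <- "hw1"

# Note 1: `[[` uses the value stored in the variable ("hw1")
x <- grades[[assessmentName]]

# `$` takes the name literally, so it looks for a column called
# "assessmentName", which does not exist here
is.null(grades$assessmentName)

# `$` also matches partially: "hw" is completed to the only candidate, "hw1"
identical(grades$hw, grades$hw1)

# Note 2: comparing against min(x) keeps every tied element,
# while which.min(x) gives the position of the first minimum only
x[x == min(x)]
x[which.min(x)]

# Note 3: an unnamed list versus a named list
unnamed <- list(1, 2)
named <- list(a = 1, b = 2)
names(unnamed)  # no names
named$b         # elements of a named list can be accessed by name
```

The named form is what produces the `$min`, `$max`, `$avg` headings when the result is printed.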

Test

We use R's built-in dataset trees for a test.

checkAssessment(trees, "Height")
#$min
#[1] 63

#$max
#[1] 87

#$avg
#[1] 76
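
The same check can be run on data shaped like the data in the question. The original `df` was not posted, so the data frame `grades` below is invented for illustration and its numbers will not match the expected output shown in the question.

```r
# The function from the answer, repeated so this block runs on its own
checkAssessment <- function(df, assessmentName)
{
  x <- df[[assessmentName]]  # extract the column named by the string
  list(min = min(x), max = max(x), avg = mean(x))
}

# Hypothetical grades, made up for this illustration
grades <- data.frame(
  student = c("a", "b", "c", "d", "e"),
  hw1     = c(0, 14, 13, 14, 12),
  hw2     = c(10, 9, 8, 10, 7)
)

res <- checkAssessment(grades, "hw1")
print(res)

# Individual statistics can be pulled out of the list by name
res$avg
```

Because the column is chosen by the string passed in, the same call works for any assessment, for example `checkAssessment(grades, "hw2")`.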

It might also be worth pointing out where your code is problematic:

checkAssessment <- function(df, assessmentName)
{
my_min <- df$assessmentName == min(assessmentName)
my_max <- df$assessmentName == max(assessmentName)
my_avg <- df$assessmentName == mean(assessmentName)
return(df[my_min, ])
return(df[my_max, ])
return(df[my_avg, ])
}

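First, min(assessmentName) does not make sense: assessmentName is a character string holding the column name, not the column of grades. In addition, df$assessmentName looks for a column literally named assessmentName, so it cannot be used with a variable. Maybe you want

df[[assessmentName]] == min(df[[assessmentName]])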

Then, return(df[my_min, ]) returns a data frame: the matching row (or rows, if several students tie) with all of its columns. Maybe you want:

return(df[my_min, assessmentName])

Finally, after the above return, the following won't have any effect:

return(df[my_max, assessmentName])
return(df[my_avg, assessmentName])

because a function exits as soon as it executes the first return. This is why you should use a "list" to get multiple returned values.
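
The effect of stacked return calls is easy to see in isolation. In this sketch, `firstOnly` and `allThree` are hypothetical function names chosen for illustration.

```r
# Only the first return() is ever reached
firstOnly <- function(x) {
  return(min(x))
  return(max(x))   # never executed
  return(mean(x))  # never executed
}

# Collecting the values in a named list hands back all three at once
allThree <- function(x) {
  list(min = min(x), max = max(x), avg = mean(x))
}

x <- c(0, 14, 12)
firstOnly(x)   # just the minimum
allThree(x)    # a list with elements min, max and avg
```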
