Geogrammer - 2 months ago
R Question

Using a custom function (ifelse) with dcast

I'm reshaping a dataframe, but instead of passing dcast a standard aggregation function such as mean, I'd like to use a custom function. Specifically, I want to use an ifelse statement to assign binary values.

Here's a reproducible example:

# dataframe that includes extraneous information
df <- data.frame(sale_id=c(1,1,1,2,2,2,3,3,4,5),
                 project_id=c(501,502,503,501,502,503,501,502,504,505),
                 sale_year=c(1990,1991,1993,1990,1992,1990,1991,1993,1990,1992),
                 var1=c(5,4,3,6,5,4,4,7,2,9),
                 var2=c(7,3,4,8,5,8,2,3,5,7))

# list of the variables I actually need (I don't need 'sale_year')
varlist <- c("var1","var2")

# selecting out id variables and variables I'm interested in manipulating
dfvars <- df[,c("sale_id","project_id",varlist)]

# melt dataframe
library(reshape2)
mdata <- melt(dfvars, id=c('sale_id','project_id'))

# create custom ifelse function, assign '1' if mean is above a critical value, and '0' if not
funx <- function(u){ifelse(mean(u)>5,1,0)}

# cast data using this function
cdata <- dcast(mdata, sale_id~variable, funx)


It works if I just use a standard function like mean, for example:

cdata <- dcast(mdata, sale_id~variable, mean)


But with my ifelse() function, I get an error about data types (logical vs. double). That doesn't make sense to me: "mean(u) > 5" should return a logical result (TRUE or FALSE), which ifelse() then uses as its test.

Answer

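I believe this comes down to type coercion. When no fill value is given, dcast works out a default by applying the aggregation function to a zero-length vector. For your function that means evaluating mean(numeric(0)) > 5, which is NA, and ifelse() returns a logical NA when its test is NA. For real groups the same function returns 1 or 0, which are doubles. So the return value of your custom function is double for some sets of observations but logical for others. The code works when you make the return type explicit.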
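
One way to see the mismatch without reshape2 is to call the function from the question on an empty vector and on a non-empty one, then compare the types. This is only a base-R sketch; the sample vector c(6, 8) is made up for illustration.

```r
# funx exactly as defined in the question
funx <- function(u){ifelse(mean(u)>5,1,0)}

# Empty input: mean() gives NaN, the comparison gives NA,
# and ifelse() hands back a logical NA
empty_result <- funx(numeric(0))
typeof(empty_result)   # "logical"

# Non-empty input (made-up values): the 'yes' value 1 comes back as a double
full_result <- funx(c(6, 8))
typeof(full_result)    # "double"
```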

Example:

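# Works: each version returns the same type for every input, including an empty vector
funx1 <- function(u){ifelse(mean(u)>5,TRUE,FALSE)}        # always logical
funx2 <- function(u){as.logical(ifelse(mean(u)>5,1,0))}   # always logical
funx3 <- function(u){as.numeric(ifelse(mean(u)>5,1,0))}   # always double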
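
A rough way to check a candidate function before handing it to dcast is to confirm that it returns the same type for an empty vector as for real data. The sketch below uses base R only. The helper same_type and the sample values c(6, 8) are hypothetical, and funx4 is not part of the answer above; it is an alternative that skips ifelse() altogether.

```r
funx1 <- function(u){ifelse(mean(u)>5,TRUE,FALSE)}
funx2 <- function(u){as.logical(ifelse(mean(u)>5,1,0))}
funx3 <- function(u){as.numeric(ifelse(mean(u)>5,1,0))}

# Not from the answer: an ifelse-free alternative that always returns a double
funx4 <- function(u){as.numeric(mean(u) > 5)}

# Hypothetical helper: TRUE when f returns the same type
# for an empty vector and for a non-empty one
same_type <- function(f) {
  identical(typeof(f(numeric(0))), typeof(f(c(6, 8))))
}

# One TRUE/FALSE per candidate function
results <- vapply(list(funx1, funx2, funx3, funx4), same_type, logical(1))
results
```

If every entry is TRUE, the function should not trigger the logical vs. double error when used as the aggregation function.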