Jason Jorgenson - 4 years ago
R Question

Dynamic R dataframes - change yes/no responses to 1/0

I use an API call to LimeSurvey to get data into a Shiny R app I'm working on. I then manipulate the dataframe so that I have only the responses given by a certain individual over time. The dataframe can look like this:

Appetite <- c("No","Yes","No","No","No","No","No","No","No")
Dental.Health <- c("No","Yes","No","No","No","No","Yes","Yes","No")
Dry.mouth <- c("No","Yes","Yes","Yes","Yes","No","Yes","Yes","No")
Mouth.opening <- c("No","No","Yes","Yes","Yes","No","Yes","Yes","No")
Pain.elsewhere <- c("No","Yes","No","No","No","No","No","No","No")
Sleeping <- c("No","No","No","No","No","Yes","No","No","No")
Sore.mouth <- c("No","No","Yes","Yes","No","No","No","No","No")
Swallowing <- c("No","No","No","No","Yes","No","No","No","No")
Cancer.treatment <- c("No","No","Yes","Yes","No","Yes","No","No","No")
Support.for.my.family <- c("No","No","Yes","Yes","No","No","No","No","No")
Fear.of.cancer.coming.back <- c("No","No","Yes","Yes","No","No","Yes","No","No")
Intimacy <- c("Yes","No","No","No","No","No","No","No","No")
Dentist <- c("No","Yes","No","No","No","No","No","No","No")
Dietician <- c("No","No","Yes","Yes","No","No","No","No","No")
Date.submitted <- c("2002-07-25 00:00:00",
"2002-09-05 00:00:00",
"2003-01-09 00:00:00",
"2003-01-09 00:00:00",
"2003-07-17 00:00:00",
"2003-11-06 00:00:00",
"2004-12-17 00:00:00",
"2005-06-03 00:00:00",
"2005-12-17 00:00:00")

theDataFrame <- data.frame( Date.submitted,
Appetite,
Dental.Health,
Dry.mouth,
Mouth.opening,
Pain.elsewhere,
Sleeping,
Sore.mouth,
Swallowing,
Cancer.treatment,
Support.for.my.family,
Fear.of.cancer.coming.back,
Intimacy,
Dentist,
Dietician)


To be clear, this dataframe could contain more (or fewer) observations of more (or fewer) variables than the example above.

My goal is to make a dynamic plot like the one produced by the following code (strictly a stacked bar chart, one bar per date, rather than a histogram):

library(dplyr)
library(ggplot2)
library(tidyr)

df <- data.frame(timeline = Sys.Date() - 1:10,
                 q3 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q4 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q5 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q6 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q7 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 q8 = sample(c("Yes", "No"), size = 10, replace = TRUE),
                 stringsAsFactors = FALSE) %>%
  mutate(q3 = ifelse(q3 == "Yes", 1, 0),
         q4 = ifelse(q4 == "Yes", 1, 0),
         q5 = ifelse(q5 == "Yes", 1, 0),
         q6 = ifelse(q6 == "Yes", 1, 0),
         q7 = ifelse(q7 == "Yes", 1, 0),
         q8 = ifelse(q8 == "Yes", 1, 0)) %>%
  gather(key = question, value = value, q3, q4, q5, q6, q7, q8)

g <- ggplot(df, aes(x = timeline, y = value, fill = question)) +
  geom_bar(stat = "identity")

g


I think I will need to use library(lubridate) for the timeline, as the entire dataframe is plain text. I deal with the '.' in the column names like this:

myColNames <- colnames(theDataFrame)

myNames <- myColNames

myNames <- gsub("^X\\.\\.", "", myNames)
myNames <- gsub("\\.", " ", myNames)
names(theDataFrame) <- myNames # items in myChoices get "labels" from myNames
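For the timeline, a minimal sketch of turning the text timestamps into real dates with base R might look like this. The variable names parsed and day are only illustrative, and the UTC time zone is an assumption; lubridate::ymd_hms() would be an alternative to the as.POSIXct() call.

```r
# Two of the timestamps from the example above, stored as plain text
Date.submitted <- c("2002-07-25 00:00:00", "2002-09-05 00:00:00")

# Parse the text into date-times (time zone assumed to be UTC)
parsed <- as.POSIXct(Date.submitted, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Keep only the calendar day, which is usually enough for a plot axis
day <- as.Date(parsed)
print(day)
```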


But the most challenging aspect is getting this to work dynamically. The datasets will contain only Date.submitted plus some number of additional columns, each holding only "Yes" or "No".

I hope I've given enough information (this is my first question on Stack Exchange!)

Answer Source

We can convert every column except the first (Date.submitted) in place using base R:

theDataFrame[-1] <- +(theDataFrame[-1]=="Yes")
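To see why that one-liner works, here is the same step split in two on a small sample. The names df and is_yes are hypothetical; the values are the first three rows of two columns from the question.

```r
# Small stand-in for the survey data
df <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00", "2003-01-09 00:00:00"),
  Appetite = c("No", "Yes", "No"),
  Dry.mouth = c("No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Comparing the non-date columns with "Yes" gives a logical matrix
is_yes <- df[-1] == "Yes"

# Unary plus coerces TRUE/FALSE to 1/0; assigning back overwrites the columns
df[-1] <- +is_yes
print(df)
```

Because the columns are selected by position (everything except the first), this should not depend on how many question columns the survey returns, as long as the date column stays first.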

Or, when the dataset is big, with lapply, which works column by column instead of building one large logical matrix:

theDataFrame[-1] <- lapply(theDataFrame[-1], function(x) as.integer(x=="Yes"))
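Putting the pieces together, a rough sketch of the dynamic version could select the question columns by name rather than by position, convert them, and reshape to the long format the plot in the question expects. The names responses, answer_cols and long are hypothetical, the reshape is done in base R rather than with tidyr::gather(), and the plotting step is skipped if ggplot2 is not installed.

```r
# Small stand-in for the survey data (first three rows of two question columns)
responses <- data.frame(
  Date.submitted = c("2002-07-25 00:00:00", "2002-09-05 00:00:00", "2003-01-09 00:00:00"),
  Appetite = c("No", "Yes", "No"),
  Dry.mouth = c("No", "Yes", "Yes"),
  stringsAsFactors = FALSE
)

# Every column except the date is treated as a Yes/No question
answer_cols <- setdiff(names(responses), "Date.submitted")

# Convert "Yes"/"No" to 1/0, one column at a time
responses[answer_cols] <- lapply(responses[answer_cols],
                                 function(x) as.integer(x == "Yes"))

# Parse the text timestamps into dates for the x axis
responses$Date.submitted <- as.Date(as.character(responses$Date.submitted))

# Reshape to long format: one row per date and question
long <- data.frame(
  Date.submitted = rep(responses$Date.submitted, times = length(answer_cols)),
  question = rep(answer_cols, each = nrow(responses)),
  value = unlist(responses[answer_cols], use.names = FALSE)
)

# Stacked bar chart, built only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(long, aes(x = Date.submitted, y = value, fill = question)) +
    geom_col()
  # print(g) to display the chart
}
```

Selecting by name keeps the conversion correct even if the date column is not first, at the cost of hard-coding the name Date.submitted.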