Bhail Bhail - 17 days ago 5
R Question

if statement won't reset with each iteration of for loop

To run this function, you need the CSV file outcome-of-care-measures.csv, which can be found at https://github.com/Bheal/Board-Q-A.
This post only concerns the following portion of the code, which is used inside the loop in my function below:

if(num=="best"){num=1}
if(num=="worst") {num=nrow(df); print(num)}


I have put this function together. I had an idea (novice that I am) of what to do, but at almost every step something needed to be tweaked to get the desired behavior.
My one remaining hurdle is that I cannot get the if statement inside my loop to assign a new value to the variable num on each iteration (when num="worst" is the function input). See the lines marked # *** below.

rankall <- function( outcome, num = "best") {
    ## Read outcome data
    tmp <- read.csv("outcome-of-care-measures.csv",na.strings="Not Available")

    b1 <- outcome %in% c("heart attack","heart failure","pneumonia")

    # if(){stop()}
    if(b1==FALSE){ stop("Invalid outcome name")}

    if(outcome=="heart attack") {i=11}
    if(outcome=="heart failure") {i=17}
    if(outcome=="pneumonia") {i=23}

    t1<-as.vector(unique(tmp[['State']]))

    #initialize a df for storage
    dfall<- data.frame("H.name"=as.character(), "S.name"=as.character(), stringsAsFactors = FALSE)


    for(x in 1:length(t1)) {                                # begin a loop, one pass per state abbreviation.

        df <- subset(tmp, State==t1[x], select= c(2,i)) # subset the data.frame for this state; keep the Hospital name column and column i (the outcome).
        df <- subset(df, !is.na(df[2]))                 # remove rows with NAs in the outcome column.

# *** *** ***

        print(dim(df))  # *** dim(df) changes on each iteration, but I can't get num below to reset using the if statement.
        # *** However if

        if(num=="best"){num=1}
        if(num=="worst") {num=nrow(df); print(num)}     # *** this only prints one time, and is equal to the no. of rows in df from the first iteration.
# *** *** ***

        df <- df[order(df[2],df[1]), ]                  # order this data.frame by outcome (primary) and Hospital (secondary).

        df[[1]] <- as.character(df[[1]])                # first column of df was class factor; change it to class character.


        entry <- c(df[num,1],t1[x])

        dfall <- rbind(dfall,entry, stringsAsFactors = FALSE)   # ? I have to use stringsAsFactors=FALSE, else dfall won't populate properly.

    }

    names(dfall) <- c("Hospital", "State")          # ? If I do not assign these names, dfall has the wrong names (tied to the 1st entry), not H.name, S.name.
    return(dfall)
}


My reliance on num works if it equals an integer in the function call, but in the case of num="worst" I need to pull a different row number on each iteration. (If num="best", this does not affect the results, since that corresponds to the first row in every iteration.)
Why are the if statements not affected by each iteration of the for loop? df is reset on each pass, and dim(df) changes too, as shown by the output of print(dim(df)) below.

if(num=="best"){num=1}
if(num=="worst") {num=nrow(df); print(num)}


As the output shows, the second printed line is 91, and num=91 is then used in all remaining iterations when num="worst" is given in the function call.

> rankall("pneumonia", "worst")
[1] 91 2
[1] 91
[1] 14 2
[1] 65 2
[1] 73 2
.
.
.
.
Hospital State
1 JACKSONVILLE MEDICAL CENTER AL
2 <NA> AK
3 <NA> AZ
4 <NA> AR
5 MARINA DEL REY HOSPITAL CA
6 <NA> CO
.
.
.


Thanks in advance.

Answer

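Try this (just to show what I meant by my comment). In the first iteration, num is overwritten with a number (91 in your output), so num=="worst" is never TRUE again in later iterations. You want to keep the num argument given in the function call and use that in each iteration, so I've added a reset in the code below.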

rankall2 <- function( outcome, num = "best") {
    ## Read outcome data
    tmp <- read.csv("outcome-of-care-measures.csv",na.strings="Not Available")

    b1 <- outcome %in% c("heart attack","heart failure","pneumonia")

    # if(){stop()}
    if(b1==FALSE){ stop("Invalid outcome name")}

    if(outcome=="heart attack") {i=11}
    if(outcome=="heart failure") {i=17}
    if(outcome=="pneumonia") {i=23}

    t1<-as.vector(unique(tmp[['State']]))

    #initialize a df for storage        
    dfall<- data.frame("H.name"=as.character(), "S.name"=as.character(), stringsAsFactors = FALSE)
    ## Keep the original num
    original.num <- num 

    for(x in 1:length(t1)) {                                # begin a loop, each state abb.
        ## Reset num
        num <- original.num

        df <- subset(tmp, State==t1[x], select= c(2,i)) # subset data.frame, for state abb., & select column with Hospital name & i(outcome col).
        df <- subset(df, !is.na(df[2]))                 # remove rows with na's in the outcome col from this data.frame.

# *** *** ***

        print(dim(df))  # debugging output: dimensions of this state's subset.

        if(num=="best"){num=1}
        if(num=="worst") {num=nrow(df); print(num)}     # num was reset above, so this now runs (and prints) for every state.
# *** *** ***

        df <- df[order(df[2],df[1]), ]                  # order this data.frame by outcome (primary) and Hospital (secondary).

        df[[1]] <- as.character(df[[1]])                # Class of First column of df changed: was class factor, changed to class char.

        entry <- c(df[num,1],t1[x])

        dfall <- rbind(dfall,entry, stringsAsFactors = FALSE)   # ? I have to use stringsAsFactors=FALSE, else dfall won't populate properly.

    }

    names(dfall) <- c("Hospital", "State")          # ? If I do not assign these names, dfall has the wrong names (tied to the 1st entry), not H.name, S.name.
    return(dfall)
}
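
An alternative to resetting num is to never overwrite it at all and store the resolved row index in a separate variable. The following is only a minimal sketch of that idea, not a drop-in replacement for rankall2: the toy data frame, its values, and the names rank_by_state and idx are made up for illustration, so it runs without the CSV file.

```r
# Toy stand-in for the hospital data (values are made up for illustration).
toy <- data.frame(
  Hospital = c("A", "B", "C", "D", "E"),
  State    = c("AL", "AL", "AL", "AK", "AK"),
  Rate     = c(10.1, 12.3, 11.0, 9.5, 14.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the hospital at rank `num` in each state.
rank_by_state <- function(data, num = "best") {
  states <- unique(data$State)
  hospitals <- character(length(states))
  for (x in seq_along(states)) {
    df <- data[data$State == states[x], ]
    df <- df[order(df$Rate, df$Hospital), ]   # outcome first, then name
    # Resolve the rank into a separate variable, so `num` keeps the
    # value passed by the caller on every iteration.
    idx <- if (identical(num, "best")) {
      1
    } else if (identical(num, "worst")) {
      nrow(df)
    } else {
      num
    }
    hospitals[x] <- df$Hospital[idx]          # NA if idx exceeds nrow(df)
  }
  data.frame(Hospital = hospitals, State = states, stringsAsFactors = FALSE)
}

res <- rank_by_state(toy, "worst")
```

With this toy data, res holds hospital B for AL and hospital E for AK, because idx is recomputed from nrow(df) for each state while num stays "worst" throughout.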