R Question

Preparing character strings for sql in () query within dbGetQuery

I have written the below query to study some sample means for my analysis and this work fine and I do not require to mug up this code. However I desparately need some help to comprehend the use of "",; sep="" alongth with limit statement in dbGetQuery.

df <- data.frame(Sample.Num = integer(),
Sample.Mean = integer(),
quant.01 = integer(),
quant.05 = integer(),
quant.10 = integer(),
quant.25 = integer(),
quant.50 = integer(),
quant.75 = integer(),
stringsAsFactors = FALSE)
df[1,] <- NA
for (i in 1:500){
sdf <- dbGetQuery(con,"select col11,col23,col30
where col1 in ('value1')
and col2 in ('(v3) - value3')
and col3 in ('v0123 - value4')
order by random()*600000 limit 100")
meansample <- mean(sdf$mileage,na.rm = TRUE)
quant.01 <- quantile(sdf$mileage,na.rm = TRUE,probs = .01)
quant.05 <- quantile(sdf$mileage,na.rm = TRUE,probs = .05)
quant.10 <- quantile(sdf$mileage,na.rm = TRUE,probs = .10)
quant.25 <- quantile(sdf$mileage,na.rm = TRUE,probs = .25)
quant.50 <- quantile(sdf$mileage,na.rm = TRUE,probs = .50)
quant.75 <- quantile(sdf$mileage,na.rm = TRUE,probs = .75)

Col1,2 & 3 have character values.
Now I need to check the same for all 7 values of col2 and 9 values of col3. Whenever I use any specific values of col1,2,3 I need to store them in a data frame to perform some operations before the loop ends and restarts with second value.
Hence this calls for studying
How to add a dynamic value into RMySQL getQuery
I did study gsub and shQuote as well. I am unable to conceptualize the use of '", df[i], "';", sep = "" despite of several combinations along with limit statement
Had it been a local database I could have used dplyr package to run the loop. I understand seq_along() requires a numeric value and in order to pass through the loop which sets right 'correct value' inside in () query. I did try paste0 and paste arguements along with collapse="," arguements but did not help.
I also tried lapply from below post but it does not work inside in () query.
How to do dbGetQuery for loop in R

(v3) - value3, (v1) - value1 etc are the actual elements in col. This is combination of code and description of the attribute col2, col3 etc.

Answer Source

Let’s assume that you have a query and want to substitute some range values in it.

sql <- "select col11,col23,col30
                    where col1 in (%s)
                    and col2 in (%s)
                    and col3 in (%s)
                    order by random()*600000 limit 100"

Here I will use sprintf to do the substitutions in the query, so I have left markers (%s) in the string. So let’s setup 3 ranges that we want to test for:

col1 <- 1:10
col2 <- c('a', 'f', 'z')
col3 <- c('name1', 'name2')

# create strings that are valid in SQL
col1_sql <- paste(col1, collapse = ',')
col2_sql <- paste0("'", col2, "'", collapse = ',')  # put quotes on strings
col3_sql <- paste0("'", col3, "'", collapse = ',')

# now substitute back in query
sql_new <- sprintf(sql,
                   col1_sql,  # strings we just constructed

# print out the query

Here is what the query will be:

select col11,col23,col30
    where col1 in (1,2,3,4,5,6,7,8,9,10)
    and col2 in ('a','f','z')
    and col3 in ('name1','name2')
    order by random()*600000 limit 100
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download