Kalinkin Alexey Kalinkin Alexey - 16 days ago 4
R Question

Substitute the ^ (power) symbol with C's pow syntax in mathematical expression

I have a math expression, for example:

((2-x+3)^2+(x-5+7)^10)^0.5


I need to replace the
^
symbol to
pow
function of C language. I think that regex is what I need, but I don't know a regex like a pro. So I ended up with this regex:

(\([^()]*)*(\s*\([^()]*\)\s*)+([^()]*\))*


I don't know how to improve this. Can you advice me something to solve that problem?

The expected output:

pow(pow(2-x+3,2)+pow(x-5+7,10),0.5)

Answer

Here is a solution that follows the parse tree recursively and replaces ^:

#parse the expression
#alternatively you could create it with
#expression(((2-x+3)^2+(x-5+7)^10)^0.5)
e <- parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5")

#a recursive function
fun <- function(e) {    
  #check if you are at the end of the tree's branch
  if (is.name(e) || is.atomic(e)) { 
    #replace ^
    if (e == quote(`^`)) return(quote(pow))
    return(e)
  }
  #follow the tree with recursion
  for (i in seq_along(e)) e[[i]] <- fun(e[[i]])
  return(e)    
}

#deparse to get a character string    
deparse(fun(e)[[1]])
#[1] "pow((pow((2 - x + 3), 2) + pow((x - 5 + 7), 10)), 0.5)"

This would be much easier if rapply worked with expressions/calls.

Edit:

OP has asked regarding performance. It is very unlikely that performance is an issue for this task, but the regex solution is not faster.

library(microbenchmark)
microbenchmark(regex = {
  v <- "((2-x+3)^2+(x-5+7)^10)^0.5"
  x <- grepl("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", v, perl=TRUE)
  while(x) {
    v <- sub("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", "pow(\\2, \\3)", v, perl=TRUE);
    x <- grepl("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", v, perl=TRUE)
  }
},
BrodieG = {
  deparse(f(parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5")[[1]]))
},
Roland = {
  deparse(fun(parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5"))[[1]])
})

#Unit: microseconds
#    expr     min      lq     mean  median      uq     max neval cld
#   regex 321.629 323.934 335.6261 335.329 337.634 384.623   100   c
# BrodieG 238.405 246.087 255.5927 252.105 257.227 355.943   100  b 
#  Roland 211.518 225.089 231.7061 228.802 235.204 385.904   100 a

I haven't included the solution provided by @digEmAll, because it seems obvious that a solution with that many data.frame operations will be relatively slow.

Edit2:

Here is a version that also handles sqrt.

fun <- function(e) {    
  #check if you are at the end of the tree's branch
  if (is.name(e) || is.atomic(e)) { 
    #replace ^
    if (e == quote(`^`)) return(quote(pow))
    return(e)
  }
  if (e[[1]] == quote(sqrt)) {
    #replace sqrt
    e[[1]] <- quote(pow)
    #add the second argument
    e[[3]] <- quote(0.5)
  }
  #follow the tree with recursion
  for (i in seq_along(e)) e[[i]] <- fun(e[[i]])
  return(e)    
}

e <- parse(text = "sqrt((2-x+3)^2+(x-5+7)^10)")
deparse(fun(e)[[1]])
#[1] "pow(pow((2 - x + 3), 2) + pow((x - 5 + 7), 10), 0.5)"
Comments