Kalinkin Alexey - 1 year ago 88
R Question

# Substitute the ^ (power) symbol with C's pow syntax in mathematical expression

I have a math expression, for example:

``````((2-x+3)^2+(x-5+7)^10)^0.5
``````

I need to replace the `^` operator with C's `pow` function. I think a regex is what I need, but I'm no regex expert. So far I have come up with this regex:

``````(\([^()]*)*(\s*\([^()]*\)\s*)+([^()]*\))*
``````

I don't know how to improve this. Can you advise me on how to solve this problem?

The expected output:

``````pow(pow(2-x+3,2)+pow(x-5+7,10),0.5)
``````

Here is a solution that follows the parse tree recursively and replaces `^`:

``````#parse the expression
#alternatively you could create it with
#expression(((2-x+3)^2+(x-5+7)^10)^0.5)
e <- parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5")

#a recursive function
fun <- function(e) {
  #check if you are at the end of the tree's branch
  if (is.name(e) || is.atomic(e)) {
    #replace ^
    if (e == quote(`^`)) return(quote(pow))
    return(e)
  }
  #otherwise recurse into every element of the call/expression
  for (i in seq_along(e)) e[[i]] <- fun(e[[i]])
  return(e)
}

#deparse to get a character string
deparse(fun(e)[[1]])
#[1] "pow((pow((2 - x + 3), 2) + pow((x - 5 + 7), 10)), 0.5)"
``````
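
To see why a recursive walk is enough, it may help to look at how R stores the expression: an operator application such as `a^b` is a call whose first element is the operator as a symbol. The snippet below is only a small illustration; the names `cl` and `parts` are made up for it.

```r
# A quoted call: R stores `a ^ b` as the call `^`(a, b)
cl <- quote((2 - x + 3)^2)

# The first element is the operator as a symbol, the rest are the arguments
parts <- as.list(cl)
print(parts[[1]])   # the symbol `^`
print(parts[[2]])   # the base, itself a call to `(`
print(parts[[3]])   # the exponent, a numeric constant

# Replacing the first element renames the function being called
cl[[1]] <- quote(pow)
print(deparse(cl))
# [1] "pow((2 - x + 3), 2)"
```

The recursive function does this same replacement at every level of the tree, which is why nested powers are handled without any special cases.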

This would be much easier if `rapply` worked with expressions/calls.
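
One way to sanity-check such a rewrite is to evaluate both forms in R and compare the results. The sketch below rests on two assumptions that are not part of the answer: `pow()` is a hypothetical R stand-in for C's `pow` (base R has no such function), and `replace_pow()` is a self-contained copy of the recursive function under a different name, using `identical()` instead of `==`.

```r
# Hypothetical stand-in for C's pow(), defined only so the result can be evaluated in R
pow <- function(base, exponent) base^exponent

# Same recursive idea as above, under an illustrative name
replace_pow <- function(e) {
  # leaves of the tree: symbols and constants
  if (is.name(e) || is.atomic(e)) {
    if (identical(e, quote(`^`))) return(quote(pow))
    return(e)
  }
  # calls: rewrite every element, including the function name
  for (i in seq_along(e)) e[[i]] <- replace_pow(e[[i]])
  e
}

original  <- quote(((2 - x + 3)^2 + (x - 5 + 7)^10)^0.5)
converted <- replace_pow(original)

x <- 3
# Both forms should evaluate to the same number
print(all.equal(eval(original), eval(converted)))
```

Checking a few values of `x` this way will not prove the rewrite correct, but it catches obvious mistakes such as swapped arguments.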

Edit:

The OP asked about performance. Performance is very unlikely to be an issue for this task, but in any case the regex solution is not faster.

``````library(microbenchmark)
#f is not defined here; presumably it is the function from BrodieG's answer
microbenchmark(
  regex = {
    v <- "((2-x+3)^2+(x-5+7)^10)^0.5"
    x <- grepl("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", v, perl=TRUE)
    while(x) {
      v <- sub("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", "pow(\\2, \\3)", v, perl=TRUE)
      x <- grepl("(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)", v, perl=TRUE)
    }
  },
  BrodieG = {
    deparse(f(parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5")[[1]]))
  },
  Roland = {
    deparse(fun(parse(text = "((2-x+3)^2+(x-5+7)^10)^0.5"))[[1]])
  })

#Unit: microseconds
#    expr     min      lq     mean  median      uq     max neval cld
#   regex 321.629 323.934 335.6261 335.329 337.634 384.623   100   c
# BrodieG 238.405 246.087 255.5927 252.105 257.227 355.943   100  b
#  Roland 211.518 225.089 231.7061 228.802 235.204 385.904   100 a
``````
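
Speed aside, the regex used in the benchmark is also narrower in scope. As written, the pattern only matches a parenthesized base followed by `^` and an unsigned numeric literal. The sketch below wraps the same loop in a helper to show this; `pattern` and `regex_pow` are illustrative names, not part of any of the answers.

```r
# The pattern from the benchmark: a balanced (...) group, then ^, then a number
pattern <- "(\\(((?:[^()]++|(?1))*)\\))\\^(\\d*\\.?\\d+)"

# Illustrative helper: keep substituting until the pattern no longer matches
regex_pow <- function(v) {
  while (grepl(pattern, v, perl = TRUE)) {
    v <- sub(pattern, "pow(\\2, \\3)", v, perl = TRUE)
  }
  v
}

# The example from the question: every ^ follows a closing parenthesis
print(regex_pow("((2-x+3)^2+(x-5+7)^10)^0.5"))

# No parentheses before ^, so nothing matches and the input comes back unchanged
print(regex_pow("x^2"))
```

The parse-tree approach has no such restriction, because it works on the structure of the expression rather than on its text.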

I haven't included the solution provided by @digEmAll, because it seems obvious that a solution with that many data.frame operations will be relatively slow.

Edit2:

Here is a version that also handles `sqrt`.

``````fun <- function(e) {
#check if you are at the end of the tree's branch
if (is.name(e) || is.atomic(e)) {
#replace ^
if (e == quote(`^`)) return(quote(pow))
return(e)
}
if (e[[1]] == quote(sqrt)) {
#replace sqrt
e[[1]] <- quote(pow)