Darkstarone - 6 months ago
Java Question

Java multiple replace on a single pass

I'm trying to translate nodes on a Newick-formatted tree, and I'm having trouble getting the replacement right. Say I have the HashMap:

"(1:" : "(30:"
",1:" : ",30:"
"(30:" : "(6:"
",30:" : ",6:"


And the tree:

(30:0.07,(1:0.06,2:0.76))


Conventional wisdom would suggest chaining multiple replaceAll calls, but this poses a problem:

replaceAll("(1:", "(30:") >> (30:0.07,(30:0.06,2:0.76))
replaceAll("(30:", "(6:") >> (6:0.07,(6:0.06,2:0.76))


The problem here is that the second call also rewrites the node that the first call had just produced. The correct tree should look like:

(6:0.07,(30:0.06,2:0.76))
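
The clobbering can be reproduced with a minimal Java sketch. It uses String.replace (literal matching) instead of replaceAll, because "(1:" on its own is not a valid regex (the parenthesis is unescaped). The class and method names are made up for illustration.

```java
class ChainedReplaceDemo {
    // Applies the two "(" mappings one after the other, as in the example above.
    static String chained(String tree) {
        // String.replace matches its first argument literally, so no escaping is needed.
        String step1 = tree.replace("(1:", "(30:");
        // The second call cannot tell original "(30:" nodes from freshly written ones.
        return step1.replace("(30:", "(6:");
    }

    public static void main(String[] args) {
        System.out.println(chained("(30:0.07,(1:0.06,2:0.76))"));
        // prints (6:0.07,(6:0.06,2:0.76))
    }
}
```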


Now I've already done this in Python:

import re

def multiple_replace(taxa, text):
    # Build one alternation from the escaped keys, then look up each match
    regex = re.compile("|".join(map(re.escape, taxa.keys())))
    return regex.sub(lambda mo: taxa[mo.group(0)], text)


But I'm having trouble with my Java implementation:

private String convertTree (String treeOld, HashMap<String, String> conv) {
    Pattern pattern = Pattern.compile("\\(\\d+:|,\\d+:");
    Matcher matcher = pattern.matcher(treeOld);
    StringBuilder sbt = new StringBuilder(treeOld);
    while (matcher.find()) {
        String replace = conv.get(matcher.group());
        System.out.println(matcher.group() + "||" +replace + " || " + matcher.start() + ":"+matcher.end());
        sbt.delete(matcher.start(), matcher.end());
        sbt.insert(matcher.start(), replace);
    }
    return treeOld;

}


While the replacing appears to work, I can't get the indexing right when the replacement has a different length from the match (as shown in the example). Is there a way to do this in Java?

Answer

You can use Matcher#appendReplacement to modify your string while matching.

Note that your regex can be simplified to [,(]\d+: as your alternative branches only differ in the first character ([,(] matches either , or ().
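
As a quick sanity check of that simplification, the sketch below runs both patterns over the sample tree and collects what each one finds. The class and helper names are hypothetical. Note that both patterns also match ",2:", which has no entry in the HashMap, so the replacement code has to handle a missing key.

```java
import java.util.*;
import java.util.regex.*;

class RegexEquivalenceCheck {
    // Collects every match of the given regex in the input, in order.
    static List<String> matches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String tree = "(30:0.07,(1:0.06,2:0.76))";
        // Original alternation from the question
        System.out.println(matches("\\(\\d+:|,\\d+:", tree));
        // Simplified character-class version
        System.out.println(matches("[,(]\\d+:", tree));
    }
}
```

Both calls return the same three matches for this input: "(30:", "(1:" and ",2:".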

Here is an IDEONE demo using appendReplacement:

import java.util.*;
import java.util.regex.*;

class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        String tree = "(30:0.07,(1:0.06,2:0.76))";
        HashMap<String, String> h = new HashMap<String, String>();
        h.put("(1:" , "(30:");
        h.put(",1:" , ",30:");
        h.put("(30:" , "(6:");
        h.put(",30:" , ",6:");
        System.out.println(convertTree(tree, h));

    }
    private static String convertTree(String treeOld, HashMap<String, String> conv) {
        Pattern pattern = Pattern.compile("[,(]\\d+:");  // "(" or "," followed by digits and ":"
        Matcher m = pattern.matcher(treeOld);            // Init the matcher
        StringBuffer result = new StringBuffer();        // StringBuffer is required before Java 9; later versions also accept a StringBuilder
        while (m.find()) {                               // Iterate through matches
            String match = m.group(0);
            // Use the HashMap value if the key exists, else reinsert the match unchanged
            String replacement = conv.containsKey(match) ? conv.get(match) : match;
            // quoteReplacement keeps any '$' or '\' in the replacement literal
            m.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(result);        // Append what remains to the result
        return result.toString();

    }
}
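
For a closer analogue of the Python version in the question, here is a minimal sketch that assumes Java 9 or later, where Matcher.replaceAll accepts a function. It builds the regex from the map keys with Pattern.quote (the counterpart of re.escape) and looks up each match in a lambda. The class name MultiReplace and the method name multipleReplace are made up for illustration.

```java
import java.util.*;
import java.util.regex.*;
import java.util.stream.Collectors;

class MultiReplace {
    // Replaces every occurrence of a map key with its value in a single pass (Java 9+).
    static String multipleReplace(Map<String, String> taxa, String text) {
        if (taxa.isEmpty()) {
            return text; // an empty alternation would match the empty string everywhere
        }
        // Quote each key so characters such as "(" are matched literally,
        // then join the keys into one alternation, like the Python version.
        String regex = taxa.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Only keys can match, so the lookup never returns null.
        // quoteReplacement keeps any '$' or '\' in the value literal.
        return Pattern.compile(regex)
                .matcher(text)
                .replaceAll(mr -> Matcher.quoteReplacement(taxa.get(mr.group())));
    }

    public static void main(String[] args) {
        Map<String, String> taxa = new HashMap<>();
        taxa.put("(1:", "(30:");
        taxa.put(",1:", ",30:");
        taxa.put("(30:", "(6:");
        taxa.put(",30:", ",6:");
        System.out.println(multipleReplace(taxa, "(30:0.07,(1:0.06,2:0.76))"));
        // prints (6:0.07,(30:0.06,2:0.76))
    }
}
```

Because the pattern is built from the keys themselves, unmapped nodes such as ",2:" are never matched and pass through untouched. If one key could be a prefix of another, the order of the alternation would matter; the keys here all end in ":", so that does not arise.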