Java regex: replace : and / with white space, except in the domain name of a URL

I have a long string containing lots of : and / characters. It also includes URLs.

I want to replace every : and / with white space, except those in the domain name part of the URLs.

will become
url test page.html

I tried
replaceAll("[://]", " ")
but it also replaces the : and / in the domain name part of the URLs with white space.

Answer

Since you need to keep a pattern in one context and replace it in another, you can use a regex that matches and captures URLs (and anything else you want to "protect") and merely matches what you need to replace. Then, inside a Matcher#find() loop, check with Matcher#group(1) whether the capture took place, and call Matcher#appendReplacement() with the appropriate replacement.

The regex can be similar to (\\bhttps?://\\S*)|[:/] (written here as a Java string literal), where (\\bhttps?://\\S*) matches and captures into Group 1 an http:// or https:// prefix followed by 0+ non-whitespace chars, and [:/] matches either : or / (to be replaced with a space).
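To see how the two alternatives behave, here is a minimal sketch that lists every match and reports whether Group 1 took part in it. The class name MatchKinds, the helper describe, and the sample input (including the example.com URL) are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchKinds {
    // Labels each match "URL:<text>" when Group 1 captured, otherwise "SEP:<text>".
    static List<String> describe(String input) {
        Matcher m = Pattern.compile("(\\bhttps?://\\S*)|[:/]").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group(1) != null ? "URL:" + m.group(1) : "SEP:" + m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample text with one URL and two stray separators.
        System.out.println(describe("see http://www.example.com/a/b at 10:30 or 1/2"));
        // prints [URL:http://www.example.com/a/b, SEP::, SEP:/]
    }
}
```

Because the URL alternative comes first and consumes the whole URL, the : and / characters inside it are never offered to the second alternative.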

Here is a sample code:

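// Placeholder URL: the sample URL in the original string was lost.
String fileText = "http://www.example.com/page.html 1: 2/";
String pattern = "(\\bhttps?://\\S*)|[:/]";
Pattern r = Pattern.compile(pattern);   // requires java.util.regex.Pattern and Matcher
Matcher m = r.matcher(fileText);

StringBuffer sb = new StringBuffer();
while (m.find()) {
    if (m.group(1) != null)
        m.appendReplacement(sb, "$1");  // a URL was captured: put it back unchanged
    else
        m.appendReplacement(sb, " ");   // a lone : or / was matched: replace with a space
}
m.appendTail(sb);                       // copy the text after the last match
System.out.println(sb);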

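Note that \\S* protects the entire URL, path included. If only the scheme and host should be kept and the slashes in the path should also become spaces, as the question's expected output "url test page.html" suggests, one possible variant narrows Group 1 to [^/\\s]*. This is a sketch under that assumption; the class name DomainOnly, the helper clean, and the example.com URL are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainOnly {
    // Group 1 protects only the scheme and host: [^/\\s]* stops at the first '/' of the path.
    private static final Pattern P = Pattern.compile("(\\bhttps?://[^/\\s]*)|[:/]");

    static String clean(String input) {
        Matcher m = P.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // "$1" re-inserts the protected prefix; any other match becomes a space.
            m.appendReplacement(sb, m.group(1) != null ? "$1" : " ");
        }
        m.appendTail(sb); // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("http://www.example.com/url/test/page.html"));
        // prints http://www.example.com url test page.html
    }
}
```

A port such as :8080 directly after the host is still kept, because [^/\\s]* only stops at a slash or white space.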

