Ann Ann - 6 days ago 5
Scala Question

How to remove a substring between two specific characters in Scala

I have this List in Scala:

List[String] = List([[aaa|bbb]], [[ccc|ddd]], [[ooo|sss]])


And I want to obtain the same List, with the | and the substring between | and ] removed from each element.

So the result would be:

List[String] = List([[aaa]], [[ccc]], [[ooo]])


I tried turning the List into a single String and using replaceAll on it, but I want to keep the result as a List.

Thanks.

Answer

You can use a simple \|.*?]] regex to match the substrings you need to remove (everything from | up to and including the closing ]]).

Here is a way to perform the replacement in Scala code:

val l = List[String]("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")
println(l.map(x => x.replaceAll("""\|.*?(]])""", "$1"))) 

See the Scala demo

I added a capturing group around ]] and used a $1 backreference in the replacement pattern to insert the ]] back into the result.
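To illustrate why the group and the $1 backreference are there, here is a minimal sketch comparing the replacement with and without them on a single element. The names sample, withoutGroup and withGroup are made up for this illustration.

```scala
// Hypothetical names, used only for this illustration.
val sample = "[[aaa|bbb]]"

// Without a group: the match "|bbb]]" is dropped entirely,
// so the closing brackets are lost.
val withoutGroup = sample.replaceAll("""\|.*?]]""", "")

// With a group: "]]" is captured and put back through $1.
val withGroup = sample.replaceAll("""\|.*?(]])""", "$1")

println(withoutGroup) // [[aaa
println(withGroup)    // [[aaa]]
```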

Details:

  • \| - a literal | pipe symbol (since it is a special char outside of a character class, it must be escaped)
  • .*? - any zero or more symbols other than line break symbols, as few as possible (the ? makes the quantifier lazy)
  • (]]) - Group 1 capturing ]] substring (note that ] outside of a character class does not need escaping, it is just the opposite of the case with |).
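If you would rather not re-insert the ]] at all, two variations should give the same result on this input. This is only a sketch, not part of the original answer: one uses a negated character class and the other a lookahead, and the names items, viaNegatedClass and viaLookahead are hypothetical. Both assume the part after | never contains a ] itself.

```scala
// Hypothetical names; the input is the list from the question.
val items = List("[[aaa|bbb]]", "[[ccc|ddd]]", "[[ooo|sss]]")

// Negated character class: match | and everything up to,
// but not including, the first ].
val viaNegatedClass = items.map(_.replaceAll("""\|[^\]]*""", ""))

// Lookahead: match | and as little as possible, stopping right
// before ]] without consuming it.
val viaLookahead = items.map(_.replaceAll("""\|.*?(?=]])""", ""))

println(viaNegatedClass) // List([[aaa]], [[ccc]], [[ooo]])
println(viaLookahead)    // List([[aaa]], [[ccc]], [[ooo]])
```

Since map returns a new List[String], the result stays a List in either variant and the original list is left unchanged.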