Can anyone tell me why
This is not an anomaly:
.* can match anything.
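You ask to replace all occurrences. The first match consumes the whole input, so the regex engine starts looking for the next match at the end of the input.
But .* also matches an empty string! It therefore matches an empty string at the end of the input, and replaces that with the replacement string as well.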
Using .+ instead will not exhibit this problem, since this regex cannot match an empty string (it requires at least one character to match).
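If you want to check this yourself, here is a minimal sketch. The class and method names (ReplaceDemo, replaceAllIn) are made up for illustration; only String.replaceAll() is the real API, and "a" is the replacement used in the example from this answer.

```java
class ReplaceDemo {
    // Hypothetical helper: replaces every match of regex in input with "a".
    static String replaceAllIn(String input, String regex) {
        return input.replaceAll(regex, "a");
    }

    public static void main(String[] args) {
        // .* matches the whole input, then the empty string at the end.
        System.out.println(replaceAllIn("test", ".*")); // aa
        // .+ needs at least one character, so there is no second match.
        System.out.println(replaceAllIn("test", ".+")); // a
    }
}
```

One thing to keep in mind with .+ is that an empty input is left untouched, since there is nothing for it to match, whereas .* would still replace the empty string once.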
Or use .replaceFirst() to only replace the first occurrence:

    "test".replaceFirst(".*", "a")
           ^^^^^^^^^^^^
Now, why .* behaves like it does and does not match more than twice (it theoretically could) is an interesting thing to consider. See below:
    # Before first run
    regex: |.*
    input: |whatever
    # After first run
    regex: .*|
    input: whatever|
    # Before second run
    regex: |.*
    input: whatever|
    # After second run: since .* can match an empty string, it is satisfied...
    regex: .*|
    input: whatever|
    # However, this means the regex engine matched an empty input.
    # Most regex engines, in this situation, will shift
    # one character further in the input.
    # So, before third run, the situation is:
    regex: |.*
    input: whatever<|ExhaustionOfInput>
    # Nothing can ever match here: out
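The same trace can be observed in Java with a Matcher loop. This is only a rough sketch: MatchTrace and spans are hypothetical names, and the "start-end" string format is just a convenient way to print the offsets of each successful find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchTrace {
    // Hypothetical helper: one "start-end" entry per successful find() call.
    static List<String> spans(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        // Whole input first, then the empty match at the end, then nothing.
        System.out.println(spans(".*", "whatever")); // [0-8, 8-8]
        // .+ cannot match the empty string, so only one match is found.
        System.out.println(spans(".+", "whatever")); // [0-8]
    }
}
```

The second entry, 8-8, is the empty match at the end of the input; after it the engine has nowhere left to go, so the loop stops.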
Note that, as @A.H. notes in the comments, not all regex engines behave this way. GNU sed, for instance, will consider that it has exhausted the input after the first match.