mfq - 4 months ago
Ruby Question

Lookahead regex in Ruby returns `nil` on irb

I have input:

s = "<tag1 value = \"HelloWorld\" val = \"1234\">"

I want to fetch HelloWorld and 1234.

I am using the regular expression shown in the call below.

On Rubular it gives the expected result, but in irb it returns an empty array:

s.scan(/(?<=\")+[a-zA-Z0-9]*+(?=\\)/) # => []

Can anybody explain why this is happening? What am I missing?

Given the declaration

s = "<tag1 value = \"HelloWorld\" val = \"1234\">"

the string value is:

<tag1 value = "HelloWorld" val = "1234">

This is easy to check by executing, e.g., `puts s`. You see the backslashes because the string was declared using double quotes, and in that case any double quotes inside the string have to be escaped with backslashes. Other ways to declare the same string in Ruby are:

s = '<tag1 value = "HelloWorld" val = "1234">'
s = %|<tag1 value = "HelloWorld" val = "1234">|
s = <<STR.chomp # chomp drops the trailing newline a heredoc adds
<tag1 value = "HelloWorld" val = "1234">
STR

None of these requires escaping the double quotes. If you copied the string into Rubular as it was displayed in irb, escaping backslashes included, you matched against a different string.
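A quick way to confirm this in irb is to compare the declarations and look for a backslash. This is only a sketch using core Ruby; the variable names `a`, `b`, `c`, `same` and `has_backslash` are illustrative and not part of the question.

```ruby
a = "<tag1 value = \"HelloWorld\" val = \"1234\">"
b = '<tag1 value = "HelloWorld" val = "1234">'
c = %|<tag1 value = "HelloWorld" val = "1234">|

# All three literals build the same string.
same = (a == b) && (b == c)

# The escapes exist only in the source code, not in the value.
has_backslash = a.include?("\\")

puts a # prints the raw value, without backslashes
p a    # p uses inspect, which shows the escaping backslashes again
```

Here `same` should be true and `has_backslash` false, which is why a pattern that expects a backslash cannot match.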

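That said, the lookahead `(?=\\)` in your regexp requires a literal backslash right after the match, and since there are no backslashes in the original string, nothing was matched in Ruby. There are other glitches with the regexp you’ve used: the `+` after the lookbehind is redundant, and `*` allows empty matches.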
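The difference between the two inputs can be reproduced with a simplified stand-in for the regexp from the question. Only the final lookahead for a literal backslash is kept here; the names `real`, `pasted` and `needs_backslash` are assumptions made for this illustration.

```ruby
# The actual value of s.
real = '<tag1 value = "HelloWorld" val = "1234">'

# The text as irb displays it, backslashes included. In a single-quoted
# literal, \" stays a backslash followed by a double quote.
pasted = '<tag1 value = \"HelloWorld\" val = \"1234\">'

# Simplified stand-in: alphanumerics followed by a literal backslash.
needs_backslash = /[a-zA-Z0-9]+(?=\\)/

on_real   = real.scan(needs_backslash)   # no backslash to look ahead to
on_pasted = pasted.scan(needs_backslash) # backslashes are really there

p on_real
p on_pasted
```

On `real` the scan finds nothing, while on `pasted` it finds both values, which mirrors the irb versus Rubular behaviour described in the question.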

Here is a more careful version of the regexp:

s.scan(/(?<=")\w+(?=")/)
#⇒ ["HelloWorld", "1234"]
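One caveat worth knowing: lookarounds do not consume the quotes, so a word sitting between a closing quote and the next opening quote also qualifies. A capture group that consumes both quotes is a possible alternative. The sketch below uses a made-up input, `tricky`, purely to show the difference; it is not taken from the question.

```ruby
s = '<tag1 value = "HelloWorld" val = "1234">'
tricky = 'x="a"b"c"' # hypothetical input with a word between two quoted values

lookaround = /(?<=")\w+(?=")/
capturing  = /"(\w+)"/

# With a capture group, scan returns one array per match, hence flatten.
from_s_lookaround = s.scan(lookaround)
from_s_capturing  = s.scan(capturing).flatten

tricky_lookaround = tricky.scan(lookaround)        # also picks up the b between the quotes
tricky_capturing  = tricky.scan(capturing).flatten # quotes are consumed, so b is skipped

p from_s_lookaround, from_s_capturing
p tricky_lookaround, tricky_capturing
```

For the string in the question both patterns give the same result, so the lookaround version is fine there; the capturing form only matters for inputs shaped like `tricky`.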