Does anybody know why I am getting different results depending on the order of the patterns?
list1 = ["AA1", "AA2","AA", "AA+"]
list2 = ["AA1", "AA2","AA+", "AA"]
results1 = "somethin with AA+ in it".scan(Regexp.union(list1))
results2 = "somethin with AA+ in it".scan(Regexp.union(list2))
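In an alternation group in an NFA (backtracking) regex engine, branches are tried from left to right and the first branch that matches "wins", even if a later branch would match more text. See Alternation with The Vertical Bar or Pipe Symbol for a more detailed explanation.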
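A minimal sketch of that rule, using two throwaway patterns (`short_first` and `long_first` are names made up for this illustration, not taken from the question):

```ruby
# Branches are tried left to right at each position; the first one
# that matches is used, even when a later branch could match more text.
short_first = /ab|abc/
long_first  = /abc|ab/

short_match = "abc"[short_first]  # the "ab" branch is tried first and succeeds
long_match  = "abc"[long_first]   # the "abc" branch is tried first and succeeds

puts short_match  # prints "ab"
puts long_match   # prints "abc"
```

The input is the same in both cases; only the order of the branches differs.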
The regexes you have are /AA1|AA2|AA|AA\+/ (from list1) and /AA1|AA2|AA\+|AA/ (from list2).
If you use the first regex, you get AA because, once AA1 and AA2 have failed, the AA branch matches before the AA\+ branch is tried. The remaining branch is not tested against the input, the match is returned, and the regex index advances.
The second regex yields AA+ because the AA\+ branch matches first and the match is returned; the AA branch is not even tested.
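Both results can be reproduced with the lists from the question. The last part is only one possible workaround, assuming you always want the longest alternative to win: sort the patterns by length, longest first, before building the union (`text`, `re1`, `re2`, `longest_first` and `results3` are names introduced here for illustration).

```ruby
list1 = ["AA1", "AA2", "AA", "AA+"]
list2 = ["AA1", "AA2", "AA+", "AA"]
text  = "somethin with AA+ in it"

# Regexp.union escapes each string, so "AA+" becomes the literal branch AA\+
re1 = Regexp.union(list1)  # /AA1|AA2|AA|AA\+/
re2 = Regexp.union(list2)  # /AA1|AA2|AA\+|AA/

results1 = text.scan(re1)  # ["AA"]  : the AA branch is reached before AA\+
results2 = text.scan(re2)  # ["AA+"] : the AA\+ branch is reached before AA

# Possible workaround (an assumption about the intent, not part of the question):
# put longer patterns first so a shorter prefix cannot shadow them.
longest_first = Regexp.union(list1.sort_by { |s| -s.length })
results3 = text.scan(longest_first)  # ["AA+"]

p results1, results2, results3
```

Sorting by length is enough here because the only pattern that is a prefix of another is AA; with overlapping patterns of other shapes you may need a different ordering.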