Bobbie Bobbie - 29 days ago 15
Javascript Question

Regular expression to fetch beginning of string or a symbol

I am writing a function that finds an attribute's value, given a string and an attribute name.

The input strings look like these:

sip:+19999999999@trunkgroup2:5060;user=phone
<sip:+19999999999;tgrp=0180401;trunk-context=aaaa.aaaa.ca@10.10.10.100:8000;user=phone;transport=udp>
<sip:19999999999;tgrp=0306001;trunk-context=aaaa.aaaa.ca@10.10.10.100:8000;transport=udp>
<sip:+19999999999;tgrp=SMPPDIN;trunk-context=aaaa.aaaa.ca@10.10.10.100:8000;transport=udp>


After a few hours I came up with this regular expression:
/(\Wsip[:,+,=]+)(\w+)/g
but it does not work for the first example, as there is no non-word character before the attribute name.

How can I fix this expression to match both cases, <sip... and sip..., only when it is at the beginning of the string?

I use this function to extract both the sip and tgrp values.

Answer Source

Replace \W with \b, and use

\b(sip[:+=]+)(\w+)

Or, to match at the beginning of a string:

^\W?(sip[:+=]+)(\w+)

See the first regex demo and the second regex demo.

Since \W is a consuming pattern that matches any non-word char (a char other than a letter, digit or _), it requires a character before sip, so there is no match at the start of the string. A \b word boundary consumes no text: before s it matches both at the start of the string and after a non-word char.
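
To make the difference concrete, here is a small sketch that runs the original pattern and the \b version against two of the sample inputs from the question. The helper name group2 is made up for this illustration.

```javascript
// Two of the sample inputs from the question.
const bare = 'sip:+19999999999@trunkgroup2:5060;user=phone';
const bracketed = '<sip:19999999999;tgrp=0306001;trunk-context=aaaa.aaaa.ca@10.10.10.100:8000;transport=udp>';

const original = /(\Wsip[:,+,=]+)(\w+)/; // needs a non-word char before "sip"
const fixed = /\b(sip[:+=]+)(\w+)/;      // only needs a word boundary

// Hypothetical helper: return Group 2, or null when there is no match.
function group2(re, s) {
  const m = re.exec(s);
  return m ? m[2] : null;
}

console.log(group2(original, bare));      // null
console.log(group2(original, bracketed)); // 19999999999
console.log(group2(fixed, bare));         // 19999999999
console.log(group2(fixed, bracketed));    // 19999999999
```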

If you literally need to find a match at the beginning of a string after an optional non-word char, replace \W with ^\W?, where ^ matches the start of the string and \W? matches 1 or 0 non-word chars.
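
The sketch below compares the two variants. The third sample string is invented for this example (it is not from the question) to show a sip that does not sit at the start of the input.

```javascript
const anywhere = /\b(sip[:+=]+)(\w+)/;  // matches "sip" wherever a word boundary precedes it
const atStart = /^\W?(sip[:+=]+)(\w+)/; // matches only at the start, after 0 or 1 non-word chars

const samples = [
  'sip:+19999999999@trunkgroup2:5060;user=phone', // from the question
  '<sip:19999999999;tgrp=0306001>',               // shortened from the question
  'to <sip:19999999999>',                         // made up: "sip" is not at the start
];

for (const s of samples) {
  console.log(anywhere.test(s), atStart.test(s));
}
// true true
// true true
// true false
```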

Also, note that a , inside a character class is matched as a literal comma. If you meant to use it to separate the chars, you should remove it.
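
A quick way to see the effect of the commas, using a made-up input in which sip is followed by a comma:

```javascript
const withCommas = /\b(sip[:,+,=]+)(\w+)/; // class contains a literal comma
const withoutCommas = /\b(sip[:+=]+)(\w+)/;  // class contains only :, + and =

const odd = 'sip,19999999999'; // invented input, not from the question

console.log(withCommas.test(odd));    // true
console.log(withoutCommas.test(odd)); // false
```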

Pattern details:

  • \b - a word boundary
    OR
  • ^ - start of string
  • \W? - 1 or 0 (due to the ? quantifier) non-word chars (i.e. chars other than letters/digits and _)

  • (sip[:+=]+) - Group 1: sip substring followed by one or more :, + or = chars

  • (\w+) - Group 2: one or more word chars.
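
Since the question mentions extracting both the sip and tgrp values with one function, here is one possible sketch that builds the \b pattern from the attribute name. The function names getAttribute and escapeRegExp are assumptions for this example, not part of the question. Because the value is captured with \w+, a value containing dots (such as the trunk-context one) would be cut at the first dot.

```javascript
// Escape regex metacharacters so the attribute name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: return the value that follows `name`, or null.
function getAttribute(input, name) {
  // Same shape as \b(sip[:+=]+)(\w+), with the name substituted in.
  const re = new RegExp('\\b(' + escapeRegExp(name) + '[:+=]+)(\\w+)');
  const m = re.exec(input);
  return m ? m[2] : null;
}

const line = '<sip:+19999999999;tgrp=0180401;trunk-context=aaaa.aaaa.ca@10.10.10.100:8000;user=phone;transport=udp>';

console.log(getAttribute(line, 'sip'));  // 19999999999
console.log(getAttribute(line, 'tgrp')); // 0180401
console.log(getAttribute('sip:+19999999999@trunkgroup2:5060;user=phone', 'tgrp')); // null
```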