mwebber - 2 months ago
Javascript Question

JavaScript regex help: comma-separated text

I have this string:

remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820,remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820


I want to match and extract the comma-separated fields from it.

The result should be:

MATCH 1
'remote:City|Vestavia Hills,AL'
MATCH 2
'remote:Citystate|Vestavia Hills'
MATCH 3
'395b5231539390675a7abe0751fc4820'
MATCH 4
'remote:City|Vestavia Hills,AL'
MATCH 5
'remote:Citystate|Vestavia Hills'
MATCH 6
'395b5231539390675a7abe0751fc4820'


I have this regex:

(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})


but entries where the city is followed by a state such as 'AL' (after a comma) are split incorrectly.
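A minimal sketch reproducing the problem, assuming the pattern is run with the global flag through String.prototype.match (the variable names are only illustrative). Because [^\,]+ stops at the first comma, the ",AL" part never becomes part of a match:

```javascript
// Sample string from above, split across two literals only for readability.
const input =
  "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820," +
  "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The regex from above, with the global flag added so every match is returned.
const originalPattern = /(remote:[a-zA-Z]+\|[^\,]+|[a-f0-9]{32})/g;

const found = input.match(originalPattern);

// The first match ends at the comma before "AL", so the state is lost.
console.log(found[0]);
console.log(found.length);
```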

Possible solution:

I was thinking of matching something like this:
remote:[a-zA-Z]+\|.*
and ending the match at a comma that is followed by either another remote:[a-zA-Z]+\|.* entry or an MD5 hash ([a-f0-9]{32},?).

I don't have much experience with excluding matches, so I was hoping someone could help me with this task. Here is my regex tester link:

https://regex101.com/r/rP8iJ2/1

Answer

You can fine-tune your regex into this lookahead-based regex:

/(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/igm

This will give the 6 matches you're expecting, with each value in captured group #1.


(?:^|,)                 # Match line start or comma
(                       # captured group #1 start
   .+?                  # match 1 or more of any character (lazy)
   (?=                  # lookahead start
      ,                 # match comma followed by
      (?:               # non-capturing group start
         [a-f0-9]{32}   # match hex digit 32 times
         |              # OR
         remote:        # match literal "remote:"
      )                 # non-capturing group end
      |                 # OR
      $                 # line end
   )                    # lookahead end
)                       # capturing group #1 end
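
For reference, here is a minimal sketch of how the pattern could be applied in JavaScript, assuming an environment that supports String.prototype.matchAll. The helper names extractFields and splitFields are hypothetical. splitFields is an alternative not taken from the regex above: it splits on the same lookahead instead of capturing, and should give the same fields for this input.

```javascript
// Sample string from the question, split across two literals only for readability.
const input =
  "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820," +
  "remote:City|Vestavia Hills,AL,remote:Citystate|Vestavia Hills,395b5231539390675a7abe0751fc4820";

// The lookahead-based pattern from the answer; the wanted text is in group 1.
const fieldPattern = /(?:^|,)(.+?(?=,(?:[a-f0-9]{32}|remote:)|$))/gim;

// Hypothetical helper: collect capture group 1 from every match.
function extractFields(text) {
  return Array.from(text.matchAll(fieldPattern), (match) => match[1]);
}

// Alternative sketch: split only on commas that are followed by
// a 32-character hex string or the literal "remote:".
function splitFields(text) {
  return text.split(/,(?=[a-f0-9]{32}|remote:)/i);
}

const fields = extractFields(input);
fields.forEach((field, index) => {
  console.log(`MATCH ${index + 1}: '${field}'`);
});
```

The leading (?:^|,) is consumed by each match but sits outside the capturing group, which is why the code reads match[1] rather than match[0].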