DBS - 5 months ago
PowerShell Question

Usage of | in PowerShell regex

I'm trying to split some text using PowerShell and experimenting a little with regex, and I would like to know exactly what the "|" character does in a PowerShell regex. For example, I have the following line of code:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|\]')}

Running this line of code gives me the following output:

-blank line-
: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png

If I run the code without the "|" in the -split statement as such:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[\]')}

I get the following output without the [] being stripped (essentially it's just displaying the select-string output):

[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png

If I modify the code and run it like this:

"[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png" | select-string '\[\d+\]:' | foreach-object {($_ -split '\[|')}

In the output, the [ is stripped from the beginning, but the output has a carriage return after each character (I did not include the full output, to save space).




The other answers already explain what the | is for, but I would like to explain what is happening in each of the examples you have above.

  1. -split '\[|\]': You are matching either [ or ], and the string contains one of each, which is why you get 3 results. The first is an empty string (shown as a blank line), not whitespace: it is the text before the first [, and there is none because [ is the first character of the line.

  2. -split '\[\]': Since you omit the | in this example, you are asking to split on the character sequence [], which does not appear in your string, so nothing is split. Contrast this with the string method $_.Split('\[\]'), which takes literal separator characters rather than a regex; in Windows PowerShell it splits on each of \, [ and ] individually. This is by design.

  3. -split '\[|': Here you are running into a caveat of not specifying the right-hand operand of the | operator. To quote the help from Regex101 when this regex is specified:

(null, matches any position)

Warning: An empty alternative effectively truncates the regex at this point because it will always find a zero-width match

This is why the last example split at every character position: the empty alternative matches everywhere. Also, I don't think any of this is specific to PowerShell. The same behavior should be seen in other regex engines as well.
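
If it helps, the three cases can be reproduced side by side with a short sketch like the one below. It feeds the sample line to -split as a plain string instead of going through select-string, and the variable names ($line, $either, $sequence, $emptyAlt, $class) are just ones I made up for illustration. The character class at the end is not from the question; it is only another way of writing "either bracket".

```powershell
# The sample line from the question, as a plain string.
$line = '[02]: ./media/active-directory-dotnet-how-to-use-access-control/acs-01.png'

# 1. '\[|\]' means "[ OR ]", so the line is cut at both brackets.
$either = $line -split '\[|\]'
$either.Count      # 3 pieces: an empty string, '02', and the rest of the line

# 2. '\[\]' means the literal sequence '[]', which never occurs in the line.
$sequence = $line -split '\[\]'
$sequence.Count    # 1 piece: the line comes back unsplit

# 3. '\[|' means "[ OR nothing"; the empty alternative matches at every position.
$emptyAlt = $line -split '\[|'
$emptyAlt.Count    # far more pieces than case 1, roughly one per character

# Illustrative alternative (not in the question): a character class
# matching either bracket, which should behave like case 1.
$class = $line -split '[\[\]]'
```

Comparing $either.Count with $emptyAlt.Count makes the effect of the dangling | easy to see without printing the whole output.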