RichT RichT - 2 months ago 26
Bash Question

Extract multiple lines of text between two key words from shell command in powershell

I have a shell command I'd like to extract data from using Powershell. The data I need will always sit between two key words and the number of lines captured can change.

The output can look something like this.

Sites:
System1:
RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2:
RPAs: OK
Volumes:
WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK


I would like to capture and store into a variable (or text file if easier?) part of this data to be reused later in the script. For example, I would like to capture everything between System1: and System2: which would produce:

RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK


I've been messing with different regex combinations with no success. I've had some moderate success with this code but it doesn't seem to be able to handle the warning lines and I also can't seem to get Out-File to work with it either, only Write-Host which does not help me much.

$RP = plink -l User -pw Password 192.168.1.100 "get_system_status summary=no" #extract from

$script = $RP

$in = $false

$script | %{
if ($_.Contains("System1"))
{ $in = $true }
elseif ($_.Contains("System2"))
{ $in = $false; }
elseif ($in)
{ Write-Host $_ }
}


Ideally I'd like to be able to take this script and use it to parse data from any shell command. I'm currently lost and almost ready to give up on this.

Answer

One option is to join the text with newlines, then use -split with a multi-line regex:

$text = 
(@'
Sites:
System1: 
RPAs: OK
Volumes: 
  WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
  WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK
System2: 
RPAs: OK
Volumes: 
  WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
  WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK
'@).split("`n") |
foreach {$_.trim()} 

$text -join "`n" -split '(?ms)(?=^System\d+:\s*)' -match '^System\d+:'

System1:
RPAs: OK
Volumes:
WARNING: Storage group DR_UCS_01-08 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_21-28 contains both replicated and unreplicated volumes. ; CS_TX
WARNING: Storage group DR_UCS_31-38 contains both replicated and unreplicated volumes. ; CS_TX
Splitters: OK

System2:
RPAs: OK
Volumes:
WARNING: Storage group MA_UCS_1 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_2 contains both replicated and unreplicated volumes. ; CS_MA
WARNING: Storage group MA_UCS_3 contains both replicated and unreplicated volumes. ; CS_MA
Splitters: OK
WAN: OK
System: OK

Edit: a more generic solution to just capturing the output between two specific keywords:

$regex = '(?ms)System1:(.+?)System2:'

$text = $text -join "`n"

$OutputText = 
[regex]::Matches($text,$regex) |
 foreach {$_.groups[1].value -split }