Petrik Petrik - 3 months ago 21
PowerShell Question

What change in PowerShell 5 changes the meaning of script block curly brackets in Group-Object

We recently updated the PowerShell version on our build servers from 4.0 to 5.0. This change caused one of our build scripts to start failing in an unexpected way.

The code determines which user guides should be included in our product. It processes a list of XML nodes that describe all available documents, each with a version and a culture. We group by document title and culture and then select the best-fitting version.

$documents = Get-ListItemsFromSharePoint
$documents = $documents |
    Where-Object { $productVersion.CompareTo([version]$_.ows_Product_x0020_Version) -ge 0 } |
    Where-Object { -not ($_.ows_EncodedAbsUrl.Contains('/Legacy/')) }

Write-Verbose -Message "Filtered to: $($documents.length) rows"

# Filter to the highest version for each unique title per language
$documents = $documents | Group-Object { $_.ows_Title, $_.ows_Localisation } |
    ForEach-Object {
        $_.Group | Sort-Object { [version]$_.ows_Product_x0020_Version } -Descending | Select-Object -First 1
    }


In PowerShell 4 this code correctly groups the documents by title and culture and then selects the most suitable version from each group. In PowerShell 5 this code puts all documents in a single group and then selects the most suitable version from that one group. Given that we have documents in multiple languages, this means that only the language with the most suitable version will be present.

The issue was fixed by changing

$documents = $documents | Group-Object { $_.ows_Title, $_.ows_Localisation } |


to

$documents = $documents | Group-Object ows_Title, ows_Localisation |


Now I understand that the first syntax is not technically correct according to the documentation, because Group-Object expects an array of property names to group on. However, in PowerShell 4 the code did return the desired results.
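A minimal sketch of the corrected pipeline on made-up data. The titles, cultures, and version numbers below are hypothetical stand-ins for the SharePoint rows. It passes two property names, so each distinct title and culture pair should form its own group:

```powershell
# Hypothetical sample rows standing in for the SharePoint list items.
$documents = @(
    [PSCustomObject]@{ ows_Title = 'User Guide';  ows_Localisation = 'en-US'; ows_Product_x0020_Version = '1.0.0' }
    [PSCustomObject]@{ ows_Title = 'User Guide';  ows_Localisation = 'en-US'; ows_Product_x0020_Version = '1.10.0' }
    [PSCustomObject]@{ ows_Title = 'User Guide';  ows_Localisation = 'de-DE'; ows_Product_x0020_Version = '1.2.0' }
    [PSCustomObject]@{ ows_Title = 'Admin Guide'; ows_Localisation = 'en-US'; ows_Product_x0020_Version = '1.0.0' }
)

# Two property names give one group per (title, culture) pair.
$groups = $documents | Group-Object ows_Title, ows_Localisation

# Keep the highest version in each group; the [version] cast makes 1.10.0 sort above 1.2.0.
$latest = $groups | ForEach-Object {
    $_.Group | Sort-Object { [version]$_.ows_Product_x0020_Version } -Descending | Select-Object -First 1
}
$latest
```

With this sample data the pipeline keeps one row per title and culture, and the en-US User Guide row that survives is the one with version 1.10.0.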

The question now is: what changed in PowerShell 5 so that the original code, which worked in PowerShell 4, no longer works?

Answer

It doesn't look like the syntax of the Group-Object cmdlet was changed, as the following shows the same definition (along with the DLL where the cmdlet is defined) for both versions:

gcm Group-Object | fl DLL,Definition


DLL        : C:\WINDOWS\Microsoft.Net\assembly\GAC_MSIL\Microsoft.PowerShell.Commands.Utility\v4.0_
             3.0.0.0__31bf3856ad364e35\Microsoft.PowerShell.Commands.Utility.dll
Definition :
             Group-Object [[-Property] <Object[]>] [-NoElement] [-AsHashTable] [-AsString]
             [-InputObject <psobject>] [-Culture <string>] [-CaseSensitive] [<CommonParameters>]
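As a quick cross-check of the definition above, the parameter metadata can also be read directly from Get-Command. This is only a sketch; the variable name $param is arbitrary:

```powershell
# Look up the metadata for the -Property parameter of Group-Object.
$param = (Get-Command Group-Object).Parameters['Property']

# Full name of the declared parameter type (the definition above shows <Object[]>).
$param.ParameterType.FullName
```

Because the element type is Object, a script block binds to -Property just as readily as a property name does, which is why the original syntax never raised a binding error.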

But as PetSerAL mentioned in a comment it looks like the 5.0 DLL handles arrays differently than 4.0. For example:

$a=[PSCustomObject]@{item=@(1,2)}   #Object with an array for the value of the item property
$b=[PSCustomObject]@{item=@(3,3)}   #Object with a different array for the value of the item property
$a.item.Equals($b.item)   #False: Equals on an array compares references, and these are two separate array objects
$a.item.GetType().Equals($b.item.GetType())   #This "shallow" compare is true because both are the array type.
$c=[PSCustomObject]@{item=@{key='value'}}   #Similar but this time the item value is a hashtable
$d=[PSCustomObject]@{item=@{anotherkey='anothervalue'}}   #again comparing the two items we expect the result to be false if deep compared but true if shallow compared
$e=[PSCustomObject]@{item=get-date} #another test using two datetimes (another common "reference" type)
$f=[PSCustomObject]@{item=[datetime]::MinValue}
$a,$b,$c,$d,$e,$f | group -Property item   #now we see what happens when using group-object

#Output in PowerShell 4.0
Count Name                      Group
----- ----                      -----
    1 {1, 2}                    {@{item=System.Object[]}}
    1 {3, 3}                    {@{item=System.Object[]}}
    2 {System.Collections.Di... {@{item=System.Collections.Hashtable}, @{item=System.Collections...
    1 8/5/2016 9:45:36 PM       {@{item=8/5/2016 9:45:36 PM}}
    1 1/1/0001 12:00:00 AM      {@{item=1/1/0001 12:00:00 AM}}

#Output in PowerShell 5.0
Count Name                      Group
----- ----                      -----
    2 {1, 2}                    {@{item=System.Object[]}, @{item=System.Object[]}}
    2 {System.Collections.Di... {@{item=System.Collections.Hashtable}, @{item=System.Collections...
    1 8/5/2016 9:45:40 PM       {@{item=8/5/2016 9:45:40 PM}}
    1 1/1/0001 12:00:00 AM      {@{item=1/1/0001 12:00:00 AM}}

Note that in version 4 the two arrays ended up in separate groups, while the two hashtables ended up in one group. In other words, arrays were compared by content (a deep compare), whereas hashtables got only a shallow compare (all hashtables are treated as equivalent).

Now in version 5 the arrays are treated as equivalent, meaning they are a shallow compare similar to how the hashtables worked.
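To make the distinction concrete, here is a small sketch that is independent of Group-Object. The variables $x and $y are made up for illustration. It contrasts the default Equals on arrays, which is a reference comparison, with a comparison of a string built from the elements:

```powershell
$x = @(1, 2)
$y = @(1, 2)   # same elements, but a different array object

$sameReference = $x.Equals($y)                      # False: two separate objects
$selfReference = $x.Equals($x)                      # True: the very same object
$sameContent   = ($x -join ',') -eq ($y -join ',')  # True: both join to '1,2'

$sameReference, $selfReference, $sameContent
```

Any grouping key that relies on the default Equals cannot tell you whether two arrays hold the same elements, so a content-based key has to be built explicitly.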

If you want to see the full details, you would need to use ILSpy or .NET Reflector to disassemble the DLL and compare the DoGrouping method of the Microsoft.PowerShell.Commands.GroupObjectCommand class between versions. It is hard to say whether this is a bug, but it is definitely a breaking change for the Group-Object cmdlet.

Update: The more I play with this, the more I think the new code is correct (except that the displayed name should just be System.Object) and that the old code had a bug. v4 seems to have done some sort of string-based comparison: even two different arrays with the same elements were grouped together, although $a.Equals([PSCustomObject]@{item=@(1,2)}) is always false (their GetHashCode results don't match). The only way I could get 5.0 to group similar arrays was group -Property {$_.item -join ','}, which matched the 4.0 output except that the name was then 1,2 instead of {1, 2}. Also, to group on a key of a hashtable item, use group -Property {$_.item.somekey} (assuming every item has a value for somekey).
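The two workarounds from the update can be sketched as follows. The sample objects and the key name somekey are illustrative only. In both cases the script block returns a plain string, so the grouping no longer depends on how arrays or hashtables are compared:

```powershell
# Group array-valued properties by their joined content.
$a1 = [PSCustomObject]@{ item = @(1, 2) }
$a2 = [PSCustomObject]@{ item = @(3, 3) }
$a3 = [PSCustomObject]@{ item = @(1, 2) }   # distinct array, same elements as $a1

$byContent = $a1, $a2, $a3 | Group-Object -Property { $_.item -join ',' }
$byContent | Select-Object Count, Name

# Group hashtable-valued properties by one of their keys.
$h1 = [PSCustomObject]@{ item = @{ somekey = 'x' } }
$h2 = [PSCustomObject]@{ item = @{ somekey = 'y' } }
$h3 = [PSCustomObject]@{ item = @{ somekey = 'x' } }

$byKey = $h1, $h2, $h3 | Group-Object -Property { $_.item.somekey }
$byKey | Select-Object Count, Name
```

Here the arrays fall into the groups 1,2 (two members) and 3,3 (one member), and the hashtables into x (two members) and y (one member).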
