Calzor Suzay Calzor Suzay - 4 months ago 72
PowerShell Question

Powershell regex to remove comma but not delimiter

I'm bringing a delimited CSV into PowerShell to remove line feeds with the code below which works but I've found I need to remove extra commas as well but not sure how to format the regex.

(Get-Content -Path $fullpath -Raw) -replace '(?<!\x0d)\x0a'," " | out-file $fullpath


My input CSV is:

"field1","field2","field3","fred,mary,john,brian","field5"


So I want to keep
","
(including the quotes), but remove any other commas on their own.

So I'd end up with:

"field1","field2","field3","fred mary john brian","field5"

Answer

This will do it.

(Get-Content -Path $fullpath -Raw) -replace ',(?!")'," " | out-file $fullpath

The negative lookahead (?!") after the , assures that the , that are followed by a " are not matched.

You could also include a negative lookbehind to it. F.e. (?<!"),(?!")
But that's probably overkill.

To only remove the comma's of a specific field, you could use Import-Csv with a Header.

$csvin = Import-Csv -Path $fullpath -Header f1,f2,f3,f4,f5
$csvin |%{$_.f4=$_.f4.replace(',',' ')}
$csvin |ConvertTo-Csv -NoTypeInformation |Select-Object -Skip 1 |Set-Content -Path $fullpath