paul543 - 3 months ago
PowerShell Question

Removing white spaces from string line and using it in an array

I looked through older posts but could not find a specific answer on how to collapse the whitespace in a string and then split that string into an array.

The input file contains lines like these:

-rw-r--r-- 1 myuser admin 315279199 May 12 02:46 2016_05_12_backup.tar.gz
-rw-r--r-- 1 myuser admin 315278122 May 13 04:56 2016_05_13_backup.tar.gz


I want to receive the following output:

program executed on 2016-05-16 / 12:18:06
to unix:
rm -fr 2016_05_12_backup.tar.gz
rm -fr 2016_05_13_backup.tar.gz


to excel log:
2016_05_12_backup.tar.gz
2016_05_13_backup.tar.gz

=============== END ==============


My code is here:

$path_in = "C:\test\input.txt"
$path_out = "C:\test\output.txt"

$endMessage = "=============== END =============="

$reader = [System.IO.File]::OpenText($path_in)
$get_time_message = "program executed on " + [datetime]::now.ToString('yyyy-MM-dd / HH:mm:ss')

try {

    add-content $path_out $get_time_message
    add-content $path_out "to unix: "

    $long_string_to_excel = ""

    while ($true) {
        $line = $reader.ReadLine()

        if ($null -eq $line) { break }

        # collapse the whitespace, then split the input line into an array
        # the replacements below are hard-coded for runs of three and two space characters

        $better_line = $line.replace('   ',' ')
        $best_line = $better_line.replace('  ',' ').split(' ')

        $stringToOutput = "rm -fr " + $best_line[8]

        $long_string_to_excel = $long_string_to_excel + $best_line[8] + "`r`n"

        add-content $path_out $stringToOutput

    }

    add-content $path_out "`n"
    add-content $path_out "to excel log: "
    add-content $path_out $long_string_to_excel
    add-content $path_out $endMessage

}
finally {
    $reader.Close()
}
write-host "program execution:`ncompleted"


This script works, but it is hard-coded for input lines that contain runs of two or three space characters. I wanted to use

$better_line = $line.replace(' +', ' ');
$best_line = $better_line.split(' ')


instead of

$better_line = $line.replace('   ',' ')
$best_line = $better_line.replace('  ',' ').split(' ')


but the results are incorrect:

program executed on 2016-05-16 / 12:18:04
to unix:
rm -fr 315279199
rm -fr 315278122


to excel log:
315279199
315278122

=============== END ==============


Could you please advise how to replace the hard-coded part so that the script works for any amount of whitespace within a line?

Answer

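String.Replace() treats its first argument as literal text, not as a pattern, so .replace(' +', ' ') looks for a space followed by a plus sign and leaves the line unchanged. Instead of the String.Split() method, use the built-in -split operator. It supports regular expressions, so you can split on "one or more whitespace characters", for example: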

PS C:\> "a   b" -split '\s+'
a
b
PS C:\> "a b" -split '\s+'
a
b
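
Applied to the lines from the question, this could look roughly like the sketch below. The sample lines and their column spacing are assumptions (the real input file is not shown with its original spacing), and the variable names $lines, $unchanged, $collapsed, $fileNames and $unixCommands are only illustrative. It also shows why the literal .Replace(' +', ' ') has no effect, and the regex-based -replace operator as an alternative when the collapsed string itself is needed.

```powershell
# Hypothetical sample input; the exact column spacing is an assumption.
$lines = @(
    '-rw-r--r--   1 myuser  admin  315279199 May 12 02:46 2016_05_12_backup.tar.gz',
    '-rw-r--r--   1 myuser  admin  315278122 May 13 04:56 2016_05_13_backup.tar.gz'
)

# String.Replace() matches its argument literally, so ' +' is only found
# where a space is followed by a plus sign. Nothing changes here.
$unchanged = $lines[0].Replace(' +', ' ')

# The -replace operator takes a regular expression and collapses each run
# of whitespace into a single space.
$collapsed = $lines[0] -replace '\s+', ' '

# -split with a regex goes straight to the fields. Trim() avoids an empty
# first element if a line happens to start with whitespace.
$fileNames = @(foreach ($line in $lines) {
    ($line.Trim() -split '\s+')[8]   # field 8 is the file name in this layout
})

# Build the "to unix" lines from the extracted file names.
$unixCommands = @($fileNames | ForEach-Object { "rm -fr $_" })
$unixCommands
```

In the original script, the same idea would replace the two hard-coded .replace() calls with a single line such as $best_line = $line.Trim() -split '\s+'. If file names may contain spaces, index 8 would only return the first word of the name; passing a maximum substring count (-split '\s+', 9) keeps everything after the eighth separator in the last element.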