Charlie Charlie - 3 months ago 13
PowerShell Question

Re-format numerous dates (each different) in .txt file

I have numerous .txt files that are output from a

handle.exe
run of several days. I need to reorganize the data to get it into a relational database. The first thing I need to do is get the dates re-formatted.

Each file has in excess of 800 dates, disbursed unevenly throughout the file. The dates are formatted:

June 29, 2016 12:05:45 PM
and I need
06-29-16 12:05:45
.

I'm just working on a single file for now, to get things dialed in. I've tried to replace the dates in situ (using an array for the original dates) with
Get-Date
and got nowhere. Then I tried
-replace
and that didn't work.

I've spent 3 or 4 days on this and I think I've broken my head. I've tried so many permutations of stuff that I don't know even where I am anymore.

The last thing I tried was below. An attempt to use a hashtable, with the old date and new date in the table.

##To set "|" as separator for arrays
$OFS = '|'

##To get original dates into array
$a = @(sls .\hp.txt -pattern '(june 29|june 30|july 1|july 2|july 3|july 4)' | select -ExpandProperty line)

##To get dates with corrected format into array
$b = @($a | foreach {$_ | Get-Date -Format "MM-dd-yy hh:mm:ss"})

##To get old and new dates into hash table
$dates = @{$a = $b}

##To bring in content from file
$file = (Get-Content C:\hp.txt)

##To replace "NAME" with "VALUE" from hash table into file
foreach ($d in $dates) {
$file = $file -replace $d.Name, $d.Value
}

##To save corrected file with new file name
Set-Content -Path C:\hpnew.txt -Value $file


The
$a
array contains (in small part):

June 29, 2016 12:04:51 PM
June 29, 2016 12:05:58 PM
June 29, 2016 12:07:00 PM
[NOTE: LOTS MORE DATES HERE]
June 30, 2016 12:01:17 AM
June 30, 2016 12:02:19 AM
June 30, 2016 12:04:22 AM
[NOTE:CONTINUING TO END]


The
$b
array contains:

06-29-16 12:04:51
06-29-16 12:05:58
06-29-16 12:07:00
[NOTE: LOTS MORE DATES ]
06-30-16 12:01:17
06-30-16 12:02:19
06-30-16 12:04:22
[NOTE: CONTINUING TO END]


There is probably a MUCH simpler, more elegant solution. But any help/direction would be great.

Answer

Use a regular expression to extract the date strings from your text, then pass the matches to a callback function where you parse them to actual DateTime values and format those according to your requirements:

$re = '((?:january|february|...|december) \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [ap]m)'

$input_fmt  = 'MMMM d, yyyy h:mm:ss tt'
$output_fmt = 'MM-dd-yy HH:mm:ss'
$culture    = [Globalization.CultureInfo]::InvariantCulture
$options    = [Text.RegularExpressions.RegexOptions]::IgnoreCase

$callback = {
  [DateTime]::ParseExact($args[0].Groups[1].Value, $input_fmt, $culture).ToString($output_fmt)
}

$txt = Get-Content '.\hp.txt' -Raw
[regex]::Replace($txt, $re, $callback, $options) | Set-Content '.\hpnew.txt'