BigRedEO BigRedEO - 7 months ago 32
Perl Question

Perl - trim leading zero on field with text and numbers

I'm parsing through a large .CSV file and have run into an issue because of Excel and Microsoft's infinite helpfulness - the .CSV file was opened in Excel to clear up a lot of problems before it even got to me, but now I have a field where a leading zero has been added. Excel doesn't work (at least that I've found) because this particular field is ProductNumber and can have any combination of letters and numbers. The problem comes in because some of the product numbers begin with a dot, i.e. - .12345678. Sometimes the dot comes later - 12.345678, sometimes no dot - 123456789, and sometimes letters and/or letters and numbers mixed - ABCDEFGHI or A12D34G56.

In every case, the field needs to be 9 characters or less. But Excel through in a "helpful" leading zero for any ProductNumber beginning with a dot(decimal) - so I have 0.12345678, turning it into a 10 character ProductNumber. I need to trim that leading zero - only in the cases where the field begins with "0." There are lots of product numbers that begin with "10.", "20." etc. Don't want to touch those.

So, is there a way for me to do this using a scalar variable as I parse through?

So, say my file has

0.12345678
10.123456
A12B34C56
ABCDEFGHI


and I have something like this to start -

my $filename = 'test.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
chomp(my $line = <$FH>);
my @fields = split(/,/, $line);
print Dumper(@fields), $/;

my @data;
# Read the lines one by one.
while($line = <$FH>) {

# split the fields on the comma.
chomp($line);
my @fields = split(/,/, $line);

# Remove leading zero on ProductNumber Field
$_ = for $fields[17];

Answer

I need to trim that leading zero - only in the cases where the field begins with "0."

You can use regular expressions to substitute the leading 0. with just a ., in-effect removing the zero.

my @data;
# Read the lines one by one.
while($line = <$FH>) {

    # split the fields on the comma.
    chomp($line);
    my @fields = split(/,/, $line);

    # Remove leading zero on ProductNumber Field
    $fields[17] =~ s/^\s*0\././;

Let me explain the regular expression a bit

^    - Match the start of the string
\s*  - Match zero or more spaces
0\.  - Match the number zero followed by a dot character
Comments