dejvid - 5 months ago
Perl Question

Perl - en / em dash in command line arguments

I'm having a problem parsing command-line arguments in my Perl script. Specifically, I'd like Perl to parse arguments preceded by an en dash or em dash as well as by a hyphen.
Please consider the following command:

my_script.pl -firstParam someValue –secondParam someValue2


As you can see, firstParam is prefixed with a hyphen and Perl parses it without a problem, but secondParam is prefixed with an en dash, and Perl does not recognize it as an argument.
I am using GetOptions() to parse arguments:

GetOptions(
"firstParam=s" => \$firstParam,
"secondParam=s" => \$secondParam
);

Answer

If you're using Getopt::Long, you can preprocess the arguments before giving them to GetOptions:

#! /usr/bin/perl
use warnings;
use strict;

use Getopt::Long;

# Replace a leading en dash (UTF-8 bytes e2 80 93) with an ASCII hyphen.
s/^\xe2\x80\x93/-/ for @ARGV;

GetOptions('firstParam:s'  => \ my $first_param,
           'secondParam:s' => \ my $second_param);
print "$first_param, $second_param\n";
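
The byte sequence in the substitution above is the UTF-8 encoding of the en dash (U+2013), so this version assumes the shell passes arguments as UTF-8. The em dash (U+2014) from the question encodes to a different final byte and is not matched. A minimal sketch to check both encodings; the helper name utf8_hex is made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: return the UTF-8 bytes of a character
# as space-separated lowercase hex.
sub utf8_hex {
    my ($char) = @_;
    my $bytes = encode('UTF-8', $char);
    return join ' ', map { sprintf '%02x', ord($_) } split //, $bytes;
}

print utf8_hex("\x{2013}"), "\n";   # en dash: e2 80 93
print utf8_hex("\x{2014}"), "\n";   # em dash: e2 80 94
```

Under that assumption, a byte-level pattern such as s/^\xe2\x80[\x93\x94]/-/ should cover both dashes.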

It might be cleaner to first decode the arguments, though:

use Encode;

$_ = decode('UTF-8', $_), s/^\N{U+2013}/-/ for @ARGV;

To work under different locale settings, use Encode::Locale:

#! /usr/bin/perl
use warnings;
use strict;

use Encode::Locale;
use Encode;
use Getopt::Long;

$_ = decode(locale => $_), s/^\N{U+2013}/-/ for @ARGV;

GetOptions('firstParam:s'  => \ my $first_param,
           'secondParam:s' => \ my $second_param);
print "$first_param, $second_param\n";
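
The snippets above only replace the en dash, while the question also mentions the em dash. The following is a sketch of one way to cover both, assuming UTF-8 encoded arguments. The helper normalize_dashes is a hypothetical name, not part of Getopt::Long; GetOptionsFromArray is used so that @ARGV itself is left untouched.

```perl
#! /usr/bin/perl
use warnings;
use strict;

use Encode qw(decode);
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: decode each argument from UTF-8 and replace a single
# leading en dash (U+2013) or em dash (U+2014) with an ASCII hyphen.
# Dashes elsewhere in an argument are left alone.
sub normalize_dashes {
    my @args = @_;
    for (@args) {
        $_ = decode('UTF-8', $_);
        s/^[\x{2013}\x{2014}]/-/;
    }
    return @args;
}

my @args = normalize_dashes(@ARGV);

GetOptionsFromArray(\@args,
    'firstParam:s'  => \ my $first_param,
    'secondParam:s' => \ my $second_param);

# Values are now decoded character strings, so encode them again on output
# if they may contain non-ASCII text.
printf "%s, %s\n", $first_param // '', $second_param // '';
```

Only the first character of each argument is rewritten, so a value that legitimately starts with a dash character would also be treated as an option; whether that is acceptable depends on the script.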