Phill Pafford - 2 months ago
Perl Question

Perl latin-9? Unicode - need to add support

I have an application that is being expanded to the UK, and I need to add support for Latin-9 (ISO-8859-15). I have done some Googling but found nothing solid on what the process involves. Any tips?

Here is some code (just the Unicode-related bits):

use Unicode::String qw(utf8 latin1 utf16);

# How to call
$encoded_txt = $self->unicode_encode($item->{value});

# Function part
sub unicode_encode {

    shift() if ref($_[0]);
    my $toencode = shift();
    return undef unless defined($toencode);

    Unicode::String->stringify_as("utf8");
    my $unicode_str = Unicode::String->new();

    # encode Perl UTF-8 string into latin1 Unicode::String
    # - currently only Basic Latin and Latin 1 Supplement
    #   are supported here due to issues with Unicode::String.
    $unicode_str->latin1( $toencode );
...


Any help would be great. Thanks.

EDIT:
I did find this post: http://czyborra.com/charsets/iso8859.html

cjm
Answer

Unicode::String is ancient, and designed to add Unicode support to older Perls. Modern versions of Perl (5.8.0 and up) have native Unicode support. Look at the Encode module and the :encoding layer. You can get a list of the supported encodings in your Perl with perldoc Encode::Supported.
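As a quick check before changing any I/O code, something like the following sketch can confirm that your Perl's Encode knows about Latin-9. It assumes only the core Encode module; the alias 'Latin9' is the one used in this answer, and the error message text is purely illustrative.

```perl
use strict;
use warnings;
use Encode ();

# Resolve the alias 'Latin9' to an encoding object, if this Perl has it.
my $enc = Encode::find_encoding('Latin9');
die "Latin-9 is not supported by this Perl\n" unless defined $enc;

# Show the canonical name that the alias resolves to.
print "Canonical name: ", $enc->name, "\n";

# List every encoding Encode knows about, including ones not yet loaded.
print "$_\n" for Encode->encodings(':all');
```

If find_encoding returns undef, the :encoding(Latin9) layer would not work either, so this is a cheap test to run first.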

Basically, you just need to decode from Latin-9 on input and encode to Latin-9 on output. The rest of the time, you should work with Perl's native Unicode strings.

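# Read a Latin-9 file:
open(my $in, '<:encoding(Latin9)', 'some/file')
    or die "Cannot open some/file: $!";
my $line = <$in>; # Automatically decodes Latin-9 into a Perl Unicode string
close($in);

# Write a Latin-9 file:
open(my $out, '>:encoding(Latin9)', 'other/file')
    or die "Cannot open other/file: $!";
print $out $line; # Automatically encodes the Perl string as Latin-9
close($out) or die "Cannot close other/file: $!";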
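If the data arrives as a string rather than from a filehandle (as in the unicode_encode method in the question), the same decode/encode step can be done in memory with Encode's decode and encode functions. The following is a minimal sketch, not a drop-in replacement for that method: the sample bytes and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: bytes from a Latin-9 source, "100 " followed by 0xA4.
my $bytes = "100 \xA4";

# Decode at the input boundary: Latin-9 bytes -> Perl Unicode string.
my $text = decode('iso-8859-15', $bytes);

# In Latin-9, byte 0xA4 is the euro sign (U+20AC); in Latin-1 it is U+00A4.
printf "U+%04X\n", ord(substr($text, -1));    # prints U+20AC

# Encode at the output boundary: Perl Unicode string -> Latin-9 bytes.
my $latin9 = encode('iso-8859-15', $text);

# The same string can be encoded as UTF-8 for consumers that expect it.
my $utf8 = encode('UTF-8', $text);
printf "Latin-9: %d bytes, UTF-8: %d bytes\n", length($latin9), length($utf8);
```

The euro sign is a useful test character because it is one of the few places where Latin-9 differs from Latin-1, so it shows quickly whether the right encoding is being applied.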