papaiatis papaiatis - 1 month ago 8
Perl Question

How to filter columns of a CSV file in Perl using Text::CSV module?

I have a CSV file lets say with three columns: id, firstname, lastname.
I'm using Text::CSV to convert this file to a TSV file but I also need to remove the id column.

I ended up this piece of code but it wrongly exports the desired columns: the final file contains the firstname, lastname headers but the values comes from id and firstname instead.

my $fhin;
open($fhin, "<", "input.csv");
my $fhout;
open($fhout, ">", "output.tsv");
my $csv = Text::CSV->new({binary => 1});
my $tsv = Text::CSV->new({binary => 1, sep_char => "\t", eol => "\n", quote_char => undef});
$csv->column_names(qw(firstname lastname));
$tsv->column_names(qw(firstname lastname));
while (my $row = $csv->getline_hr($fhin)) {
$tsv->print_hr($fhout, $row);
}
close($fhin);
close($fhout);


I read the docs here: http://search.cpan.org/~makamaka/Text-CSV-1.33/lib/Text/CSV.pm

What's wrong with my code?

input.csv

"id","firstname","lastname"
1,John,Doe


output.tsv expected

firstname lastname
John Doe


output.tsv actual

firstname lastname
1 John

mob mob
Answer

I don't think

$csv->column_names(qw(firstname lastname))

does what you think it does. You seem to think that it incorporates the data in your file's header row and produces references to the columns that correspond to the names "firstname" and "lastname" (i.e., the 2nd and 3rd column). But what it really does on an input file is to associate names with the columns in your input, regardless of what is in the header row. That is, it assigns the name "firstname" to the 1st column and "lastname" to the 2nd column.

Your code will work* if you change this line to

$csv->column_names(qw(id firstname lastname));

* - you also want to either remove the stray space between "id", and "firstname" in the input or use the setting allow_loose_quotes => 1 in the constructor for $csv.

Comments