bms9nmh bms9nmh - 3 months ago 9
Perl Question

Why does the size of my hash (from a csv) read exactly half the value of the number of lines?

I've written a simple Perl script to count the number of values and/or keys in a hash that was created from the contents of a csv. The csv looks like this:

311552047969,THE UPSETTERS RETURN OF THE SUPER APE VINYL LP 1978 ,http://www.ebay.co.uk/itm/UPSETTERS-RETURN-SUPER-APE-VINYL-LP-1978-/311552047969,56.0
322016291276,Queen A Kind Of Magic NZ Orange Vinyl,http://www.ebay.co.uk/itm/Queen-Kind-Magic-NZ-Orange-Vinyl-/322016291276,165.0
252288285264,Goldfrapp Black cherry vinyl record lp,http://www.ebay.co.uk/itm/Goldfrapp-Black-cherry-vinyl-record-lp-/252288285264,70.0
331782523967,Reggae vinyl johny pram pram ,http://www.ebay.co.uk/itm/Reggae-vinyl-johny-pram-pram-/331782523967,73.0
391392294381,Various vinyl albums,http://www.ebay.co.uk/itm/Various-vinyl-albums-/391392294381,102.24


Here is my script to calculate the number of lines.

#!/bin/perl


open CSV2, "<csv2" or die;
@csv2=<CSV2>;
close CSV2;


%hash = @csv2;

@keys = keys %hash;
@values = values %hash;

$size = @values;
print "Hash size is $size";


The actual number of lines in the csv is 6374, but my code reports exactly half that: 3187.

I'm sure there is a simple explanation for this, but why doesn't the size of the hash (i.e. the number of values/keys), match the number of lines in my csv?

Answer

When you assign a list to a hash, Perl treats the list as alternating key, value pairs, so every two consecutive lines of the csv become a single hash entry. You are therefore populating a hash in which the key

'311552047969,THE UPSETTERS   RETURN OF THE SUPER APE VINYL LP 1978 ,http://www.ebay.co.uk/itm/UPSETTERS-RETURN-SUPER-APE-VINYL-LP-1978-/311552047969,56.0'

has value

'322016291276,Queen A Kind Of Magic NZ Orange Vinyl,http://www.ebay.co.uk/itm/Queen-Kind-Magic-NZ-Orange-Vinyl-/322016291276,165.0'

and so on. With 6374 lines, that gives 6374 / 2 = 3187 entries, which is the number your script prints.
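
A minimal sketch of the pairing behaviour and one possible way around it, assuming you want one hash entry per line keyed on the first field (the item ID). The sample rows are shortened stand-ins for the csv, and the names `%paired` and `%by_id` are only illustrative. Splitting on the first comma is a simplification that works here because the ID comes first; for general csv parsing a dedicated module such as Text::CSV is usually the safer choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, hypothetical stand-ins for the lines read from the csv.
my @lines = (
    "311552047969,THE UPSETTERS,56.0\n",
    "322016291276,Queen A Kind Of Magic,165.0\n",
    "252288285264,Goldfrapp Black cherry,70.0\n",
    "331782523967,Reggae vinyl,73.0\n",
);

# List assignment pairs the elements up:
# line 1 => line 2, line 3 => line 4, so the hash has half as many entries.
my %paired = @lines;
my $paired_count = keys %paired;

# The number of lines is just the array evaluated in scalar context.
my $line_count = @lines;

# One entry per line: use the first field (the item ID) as the key
# and the remainder of the row as the value.
my %by_id;
for my $line (@lines) {
    chomp(my $row = $line);
    my ($id, $rest) = split /,/, $row, 2;
    $by_id{$id} = $rest;
}
my $by_id_count = keys %by_id;

print "Lines: $line_count\n";
print "Paired hash size: $paired_count\n";
print "Keyed-by-ID hash size: $by_id_count\n";
```

If all you need is the line count, the array in scalar context is enough and no hash is required. Note that keying on the ID would collapse any rows that share the same ID into one entry.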
