SaltedPork - 16 days ago
Perl Question

How do I read strings into a hash in Perl?

I have a file containing series of random A's, G's, C's and T's that looks like this:

>Mary
ACGTACGTACGTAC
>Jane
CCCGGCCCCTA
>Arthur
AAAAAAAAAAT


I took those letters and concatenated them to end up with:
ACGTACGTACGTACCCCGGCCCCTAAAAAAAAAAT
I now have a series of positions within that concatenated sequence that are of interest to me, and I want to find the names associated with those positions (coordinates). I'm using the Perl function length to calculate the length of each sequence, and then associating the cumulative length with the name in a hash.
So far I have:

#!/usr/bin/perl
use strict;
use warnings;

my $seq_input   = $ARGV[0];
my $coord_input = $ARGV[1];
my %idSeq;    # Stores sequences and their associated IDs.

open(my $INPUT, '<', $seq_input)   or die "unable to open $seq_input: $!";
open(my $COORD, '<', $coord_input) or die "unable to open $coord_input: $!";

while (<$INPUT>) {
    if (/^[AGCT]/) {
        # sequence line: this is where I want to store it in %idSeq
    }
    elsif (/^>/) {
        # header line: this is where I want to capture the ID
    }
}

# Pseudocode for the rest:
# put information into a hash
# loop through hash looking for coordinates that are lower than the cumulative length
#
# foreach $id
#     $totalLength = $totalLength + length($seq)
#     $lengthId{$totalLength} = $id
# foreach $position
#     foreach $length
#         if ($length >= $position) { print; last }

close $INPUT;
close $COORD;
print "Done!\n";


So far I'm having trouble reading the file into a hash. Also, would I need an array to print the hash?

Answer

Not completely clear on what you want; maybe this:

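my $seq = '';    # concatenated sequence built so far
my %idSeq;       # name => 0-based offset where that name's sequence starts
while ( my $line = <$INPUT> ) {
    if ( my ($name) = $line =~ /^>(.*)/ ) {
        # header line: the current length of $seq is where this sequence starts
        $idSeq{$name} = length $seq;
    }
    else {
        # sequence line: strip the newline and append
        chomp $line;
        $seq .= $line;
    }
}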

which produces:

$seq = 'ACGTACGTACGTACCCCGGCCCCTAAAAAAAAAAAT';
%idSeq = (
      'Mary' => 0,
      'Jane' => 14,
      'Arthur' => 25,
);
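
If the goal is then to map each coordinate back to a name, something along these lines might work. This is only a sketch: it assumes the coordinates are 1-based positions in the concatenated sequence (which is what the `$length >= $position` test in the question suggests), and `name_at` and the example positions are made up for illustration. It also shows that the hash can be printed by looping over its keys, with no extra data structure needed beyond a sorted key list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Start offsets as produced by the loop above (0-based).
my %idSeq = ( Mary => 0, Jane => 14, Arthur => 25 );

# Names ordered by start offset, so the hash can be walked in sequence order.
my @names = sort { $idSeq{$a} <=> $idSeq{$b} } keys %idSeq;

# Hypothetical helper: return the name whose sequence covers a 1-based position.
# Assumption: positions count from 1; returns undef for positions below 1.
sub name_at {
    my ($pos) = @_;
    my $found;
    for my $name (@names) {
        last if $idSeq{$name} >= $pos;    # this sequence starts after $pos
        $found = $name;
    }
    return $found;
}

# A hash can be printed by iterating over its keys.
print "$_ starts at offset $idSeq{$_}\n" for @names;

# Made-up example coordinates.
for my $pos ( 1, 14, 15, 26 ) {
    print "$pos\t", name_at($pos), "\n";
}
```

With the offsets above, positions 1 and 14 fall in Mary, 15 in Jane and 26 in Arthur. Note that this sketch does not check the upper bound: a position past the end of the concatenated sequence is still reported as the last name, so compare against `length $seq` first if that matters.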