I have files whose header lines start with a > character. If a header has the form '>anything1|anything2', I use this script to trim it and get the output header '>anything1'.
    while (<>) {
        if (/^(>[^|]*)/) {
            print "$1\n";
        } else {
            print;
        }
    }
>anything1|anything2|anything3 bla bla bla /#
>anything1
How about stepping away from the regex and using split instead?
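    while (<>)
    {
        if (/^>/)
        {
            # Drop the newline first so it does not stay attached to the last field
            chomp;
            my @fields = split /\|/, $_;
            # One or two fields: keep only the first
            if (@fields <= 2) { print "$fields[0]\n" }
            # Three or more fields: keep the first two
            else { print join('|', @fields[0,1]), "\n" }
            next;
        }
        print;
    }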
Please consider possible edge cases. They are easy to handle when you have an array.
With a regex, you can either match each case separately or carefully craft a single pattern that bundles those two or three scenarios, which is far more involved.
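For comparison, here is a minimal sketch of the "match cases separately" route, assuming the same rule as the split-based loop above (three or more fields keep the first two, otherwise keep only the first). The helper name `trim_header` is made up for illustration and is not from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this example): trims one header line.
sub trim_header {
    my ($line) = @_;
    chomp $line;

    # Case 1: at least two pipes, so keep the first two fields.
    return $1 if $line =~ /^(>[^|]*\|[^|]*)\|/;

    # Case 2: a header with at most one pipe, so keep the first field only.
    return $1 if $line =~ /^(>[^|]*)/;

    # Case 3: not a header, so return the line unchanged.
    return $line;
}

for my $line (
    ">anything1|anything2|anything3 bla bla bla /#\n",
    ">anything1|anything2\n",
    ">anything1\n",
    "ACGT\n",
) {
    print trim_header($line), "\n";
}
# prints:
# >anything1|anything2
# >anything1
# >anything1
# ACGT
```

One edge case where the two approaches differ: split drops trailing empty fields by default, so for a header like `>a|b|` the split-based loop sees two fields and prints `>a`, while this regex sees a second pipe and returns `>a|b`. Which of the two is right depends on what such headers should mean in your data.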