user3781528 - 4 months ago
Perl Question

Filtering lines from file

I would like to capture lines from a file that contain: "ExprControl" or "5p3pAssays" or "Fusion".

However, I would like to skip lines that contain both "Fusion" and "NoCall". The code below fails to skip them. How do I correctly omit these lines? Thank you.

...
open my $in_fh, '<', $full_tsv_file
    or die qq{Unable to open "$full_tsv_file" for input: $!};

while ( <$in_fh> ) {

    next if /^#/;
    next if /\b(?:Fusion&NoCall)\b/;
    next unless /\b(?:ExprControl|5p3pAssays|Fusion)\b/;

    my @fields = split('\t');

    my $location = $fields[$location_col];
    $location =~ s/"//g;
...

Answer

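& doesn't mean "and" in regular expressions. It is an ordinary literal character, so /\b(?:Fusion&NoCall)\b/ only matches the literal text "Fusion&NoCall". Match twice and combine the two results with Perl's && operator instead (&& binds more tightly than ||, so the comment check below stays a separate condition):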

while (<>) {
    next if /^#/ || /\bFusion\b/ && /\bNoCall\b/;
    next unless /\b(?:Fusion|5p3pAssays|ExprControl)\b/;
    print;
}

Tested against:

a ExprControl b
c 5p3pAssays d
e Fusion f NoCall g
h NoCall i Fusion j
k Fusion l
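
For reference, here is a minimal self-contained sketch of the same filter applied to the sample lines above. The subroutine name keep_line is hypothetical, introduced only for illustration; the patterns are the ones from the answer. With this input it should print the ExprControl line, the 5p3pAssays line, and the one Fusion line that has no NoCall.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): returns 1 when a
# line should be kept by the filter, 0 otherwise.
sub keep_line {
    my ($line) = @_;
    return 0 if $line =~ /^#/;                                   # skip comment lines
    return 0 if $line =~ /\bFusion\b/ && $line =~ /\bNoCall\b/;  # skip Fusion + NoCall, in either order
    return $line =~ /\b(?:Fusion|5p3pAssays|ExprControl)\b/ ? 1 : 0;
}

my @sample = (
    'a ExprControl b',
    'c 5p3pAssays d',
    'e Fusion f NoCall g',
    'h NoCall i Fusion j',
    'k Fusion l',
);

# Print only the lines that survive the filter.
print "$_\n" for grep { keep_line($_) } @sample;
```

Wrapping the checks in a subroutine is only a convenience for trying them on individual lines; inside a while loop the equivalent next if / next unless statements from the answer work the same way.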