Abraham GR - 1 month ago
Perl Question

Match with two or more options in a Perl regex

I have two formats obtained from QIIME analyses, one from the Silva database and the other from GreenGenes. The difference between the files is that Silva files have a progressive D_number prefix for each taxon (kingdom = D_0__, phylum = D_1__, class = D_2__, and so on), while GreenGenes files have a letter for each taxon (kingdom = k__, phylum = p__, class = c__, and so on).

file_1 (Silva format)
D_0__Archaea;D_1__Euryarchaeota;D_2__Thermoplasmata;D_3__Thermoplasmatales;D_4__ASC21;D_5__uncultured euryarchaeote



file_2 (GreenGenes format)
k__Bacteria;p__Actinobacteria;c__Actinobacteria;o__Actinomycetales;f__Streptomycetaceae;g__Streptomyces


So I made two scripts in Perl (one for Silva and one for GreenGenes) to extract each taxon into a separate file.

I'm trying to make the match section of the code handle both formats. That is:

on line 16, I want two options, something like:

my @kingd=($taxon_value[0]=~m/D_0__(.*);D_1/g | m/k__(.*);p/g);


Well, I know that it doesn't work.

So how can I add two or more options on the same line for a regex match?

This is part of the script (it has 6 options; I only show the kingdom option):

while (<INPUTFILE>){
    $line=$_;
    chomp($line);
    if ($line=~ m/^#/g){
        next;
    }
    elsif ($line=~ m/^[Uu]nassigned/g){
        next;
    }
    elsif ($line){
        my @full_line = $_;
        foreach (@full_line){
            my (@taxon_value)= split (/\t/, $_);
            foreach ($taxon_value[0]){
                if ($kingdom){
                    my @kingd=($taxon_value[0]=~m/D_0__(.*);D_1/g); # just for silva
                    foreach (@kingd){
                        if ($_=~/^$/){
                            next;
                        }
                        elsif ($_=~ m/^[Uu]nknown/g){
                            next;
                        }
                        elsif ($_=~ m/^[Uu]ncultured$/g){
                            next;
                        }
                        elsif ($_=~ m/^[Uu]nidentified$/g){
                            next;
                        }
                        else {
                            push @taxon_list, $_;
                        }
                    }
                }
            }
        }
    }
}


thanks

Answer

You need to do the "or" inside your pattern. You do that with a pipe |, which you already had, but it needs to go into the pattern itself. There is no need for two match operators.

my @kingd = $taxon_value[0] =~ m/D_0__(.*);D_1|k__(.*);p/g;

It will now match either one or the other. See perlre and perlretut for more information. You should also read the information provided in the regex tag wiki here on SO, as it contains links to many useful tools.
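One thing to watch for with two capture groups in an alternation: in list context each match returns both groups, so the branch that did not match contributes an undef to @kingd. A minimal sketch of one way around that, assuming a rank name never contains a semicolon, is to alternate only over the prefixes and keep a single capture group. The sub name extract_kingdom is hypothetical and only there for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the kingdom name from a Silva or GreenGenes
# taxonomy string, or an empty list when neither prefix is present.
# Call it in list context to get the captured text.
sub extract_kingdom {
    my ($taxonomy) = @_;
    # (?:^|;)        the prefix must start the string or follow a semicolon
    # (?:D_0__|k__)  either prefix, grouped without capturing
    # ([^;]*)        the name, up to the next semicolon
    return $taxonomy =~ m/(?:^|;)(?:D_0__|k__)([^;]*)/;
}

my @silva = extract_kingdom('D_0__Archaea;D_1__Euryarchaeota;D_2__Thermoplasmata');
my @gg    = extract_kingdom('k__Bacteria;p__Actinobacteria;c__Actinobacteria');

print "@silva\n";    # Archaea
print "@gg\n";       # Bacteria
```

The same shape should carry over to the other ranks by swapping the two prefixes, for example D_1__ and p__ for phylum.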

Your original code didn't work because a | outside the pattern is Perl's bitwise-or operator. It combines the results of the two match operations, not the two patterns.
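Here is a small sketch of the difference, using the Silva sample line from the question. Because =~ binds more tightly than |, the first match is applied to the taxonomy string and the second, which has no =~ of its own, is applied to $_. The | then combines the two scalar results, so the captured text is lost. The variable names @fixed and @broken are just for this illustration:

```perl
use strict;
use warnings;

my $taxonomy = 'D_0__Archaea;D_1__Euryarchaeota;D_2__Thermoplasmata';
my (@fixed, @broken);

# Alternation inside one pattern: the captured text is returned.
# grep drops the undef left by the branch that did not match.
@fixed = grep { defined } $taxonomy =~ m/D_0__(.*);D_1|k__(.*);p/g;

for ($taxonomy) {    # alias $_ to the string, as the question's foreach does
    # The question's expression, parsed as:
    #   ($taxonomy =~ m/.../g) | ($_ =~ m/.../g)
    @broken = ($taxonomy =~ m/D_0__(.*);D_1/g | m/k__(.*);p/g);
}

print "fixed:  @fixed\n";    # fixed:  Archaea
print "broken: ", scalar(@broken), " element, not the kingdom name\n";
```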
